Description

TECHNICAL FIELD OF THE INVENTION 

[0001] The invention relates to methods and compositions for the prediction, diagnosis, prognosis, prevention and
treatment of neoplastic disease. Neoplastic disease is often caused by chromosomal rearrangements which lead to
over- or underexpression of the rearranged genes. The invention discloses genes which are overexpressed in
neoplastic tissue and are useful as diagnostic markers and targets for treatment. Methods are disclosed for predicting,
diagnosing and prognosing as well as preventing and treating neoplastic disease.

BACKGROUND OF THE INVENTION 

[0002] Chromosomal aberrations (amplifications, deletions, inversions, insertions, translocations and/or viral
integrations) are of importance for the development of cancer and neoplastic lesions, as they account for deregulation
of the respective regions. Amplifications of genomic regions have been described in which genes of importance for
growth characteristics, differentiation, invasiveness or resistance to therapeutic intervention are located. One of those
regions with chromosomal aberrations is the region carrying the HER-2/neu gene, which is amplified in breast cancer
patients. In approximately 25% of breast cancer patients the HER-2/neu gene is overexpressed due to gene
amplification. HER-2/neu overexpression correlates with a poor prognosis (relapse, overall survival, sensitivity to
therapeutics). The importance of HER-2/neu for the prognosis of disease progression has been described [Gusterson et
al., 1992, (1)]. Gene-specific antibodies raised against HER-2/neu (Herceptin™) have been generated to treat the
respective cancer patients. However, only about 50% of the patients benefit from the antibody treatment with
Herceptin™, which is most often combined with a chemotherapeutic regimen. The discrepancy among HER-2/neu positive
tumors (overexpressing HER-2/neu to a similar extent) with regard to responsiveness to therapeutic intervention
suggests that additional factors or genes might be involved in the growth and apoptotic characteristics of the
respective tumor tissues. There seems to be no monocausal relationship between overexpression of the growth factor
receptor HER-2/neu and therapy outcome. In line with this, the measurement of commonly used tumor markers such as
estrogen receptor, progesterone receptor, p53 and Ki-67 provides only very limited information on the clinical outcome
of specific therapeutic decisions. Therefore there is a great need for a more detailed diagnostic and prognostic
classification of tumors to enable improved therapy decisions and prediction of survival of the patients. The present
invention addresses the need for additional markers by providing genes whose expression is deregulated in tumors and
correlates with clinical outcome. One focus is the deregulation of genes present in specific chromosomal regions and
their interaction in disease development and drug responsiveness.

[0003] HER-2/neu and other markers for neoplastic disease are commonly assayed with diagnostic methods such
as immunohistochemistry (IHC) (e.g. HercepTest™ from DAKO Inc.) and fluorescence in situ hybridization (FISH)
(e.g. quantitative measurement of HER-2/neu and Topoisomerase II alpha with a fluorescence in situ hybridization
kit from VYSIS). Additionally, HER-2/neu can be assayed by detecting HER-2/neu fragments in serum with an ELISA
test (BAYER Corp.) or with a quantitative PCR kit which compares the amount of HER-2/neu gene with the amount
of a non-amplified control gene in order to detect HER-2/neu gene amplifications (ROCHE). These methods, however,
exhibit multiple disadvantages with regard to sensitivity, specificity, technical and personnel effort, cost, time
consumption and inter-lab reproducibility. These methods are also restricted with regard to measurement of multiple
parameters within one patient sample ("multiplexing"). Usually only about 3 to 4 parameters (e.g. genes or gene
products) can be detected per tissue slide. Therefore, there is a need to develop a fast and simple test to measure
multiple parameters simultaneously in one sample. The present invention addresses the need for a fast and simple
high-resolution method that is able to detect multiple diagnostic and prognostic markers simultaneously.
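By way of illustration only, the comparison of a target gene amount with a non-amplified control gene by quantitative PCR, as mentioned above, might be sketched as follows. The sketch uses the commonly applied comparative threshold-cycle (Ct) calculation, which is an assumption and is not taken from any of the kits named above. The function name, the example Ct values and the assumed amplification efficiency of 2.0 per cycle are hypothetical.

```python
def relative_gene_dosage(ct_target, ct_control, ct_target_ref, ct_control_ref,
                         efficiency=2.0):
    """Estimate target gene dosage in a sample relative to a reference sample.

    ct_target / ct_control: threshold cycles of the target gene and of the
    non-amplified control gene in the sample under test.
    ct_target_ref / ct_control_ref: the same two values in a reference sample
    (e.g. non-neoplastic tissue).
    efficiency: assumed amplification factor per cycle (2.0 = perfect doubling).
    """
    # Difference between target and control within each sample.
    delta_sample = ct_target - ct_control
    delta_reference = ct_target_ref - ct_control_ref
    # A lower Ct means more template, hence the negative exponent.
    return efficiency ** -(delta_sample - delta_reference)


if __name__ == "__main__":
    # Hypothetical Ct values: the target crosses threshold 3 cycles earlier
    # than the control in the test sample, but at the same cycle in the reference.
    ratio = relative_gene_dosage(22.0, 25.0, 25.0, 25.0)
    print(f"relative gene dosage: {ratio:.1f}")
```

A value close to 1 would suggest no amplification relative to the reference, while larger values would suggest a gain in copy number. Any cut-off used to call an amplification would have to be established experimentally.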

SUMMARY OF THE INVENTION 

[0004] The present invention is based on the discovery that chromosomal alterations in cancer tissues can lead to
changes in the expression of genes that are encoded by the altered chromosomal regions. By way of example, 43 human
genes have been identified that are co-amplified in neoplastic lesions from breast cancer tissue, resulting in altered
expression of several of these genes (Tables 1 to 4). These 43 genes are differentially expressed in breast cancer
states, relative to their expression in normal, or non-breast cancer states. The present invention relates to
derivatives, fragments, analogues and homologues of these genes and uses or methods of using the same.
[0005] The present invention further relates to novel preventive, predictive, diagnostic, prognostic and therapeutic
compositions and uses for malignant neoplasia and breast cancer in particular. Membrane-bound marker gene products
containing extracellular domains can be especially useful targets for treatment methods as well as
diagnostic and clinical monitoring methods.
[0006] It is a discovery of the present invention that several of these genes are characterized in that their gene
products functionally interact in signaling cascades or by directly or indirectly influencing each other. This interaction
is important for the normal physiology of certain non-neoplastic tissues (e.g. brain or neurogenic tissue). The
deregulation of these genes in neoplastic lesions, where they normally exhibit a different level of activity or are
not active, however, results in pathophysiology and affects the characteristics of the disease-associated tissue.

[0007] The present invention further relates to methods for detecting these deregulations in malignant neoplasia at the
DNA and mRNA level.

[0008] The present invention further relates to a method for the detection of chromosomal alterations characterized
in that the relative abundance of individual mRNAs, encoded by genes located in altered chromosomal regions, is
detected.

[0009] The present invention further relates to a method for the detection of the flanking breakpoints of said
chromosomal alterations by measurement of DNA copy number by quantitative PCR or DNA arrays and DNA sequencing.
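As a non-limiting sketch, locating the loci that flank an amplified region from copy-number measurements might be approached as follows. The sketch assumes that relative copy numbers (1.0 corresponding to the normal state) have already been determined for loci ordered along the chromosome. The marker names, the example values and the threshold of 2.0 are hypothetical and are used for illustration only.

```python
def amplified_segments(loci, copy_numbers, threshold=2.0):
    """Return (first_locus, last_locus) for every contiguous amplified run.

    loci: locus names ordered by chromosomal position.
    copy_numbers: relative copy number per locus (1.0 = normal), same order.
    threshold: hypothetical cut-off above which a locus counts as amplified.
    """
    if len(loci) != len(copy_numbers):
        raise ValueError("loci and copy_numbers must have the same length")

    segments = []
    start = None  # index of the first locus of the current amplified run
    for i, value in enumerate(copy_numbers):
        if value > threshold and start is None:
            start = i                                   # run begins here
        elif value <= threshold and start is not None:
            segments.append((loci[start], loci[i - 1]))  # run ended at i - 1
            start = None
    if start is not None:                               # run reaches the last locus
        segments.append((loci[start], loci[-1]))
    return segments


if __name__ == "__main__":
    markers = ["m1", "m2", "m3", "m4", "m5"]   # hypothetical ordered markers
    values = [1.0, 4.2, 5.1, 3.8, 1.1]         # hypothetical relative copy numbers
    print(amplified_segments(markers, values))
```

Under these assumptions each breakpoint would lie between the outermost amplified locus and its non-amplified neighbor, which narrows the interval that would then be resolved by DNA sequencing.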
[0010] The present invention further relates to a method for the prediction, diagnosis or prognosis of malignant
neoplasia by the detection of DNA sequences that flank said genomic breakpoints or are located within them.

[0011] The present invention further relates to a method for the detection of chromosomal alterations characterized
in that the copy number of one or more genomic nucleic acid sequences located within one or more altered chromosomal
regions is detected by quantitative PCR techniques (e.g. TaqMan™, Lightcycler™ and iCycler™).
[0012] The present invention further relates to a method for the prediction, diagnosis or prognosis of malignant
neoplasia by the detection of at least 2 markers whereby the markers are genes and fragments thereof or genomic nucleic
acid sequences that are located on one chromosomal region which is altered in malignant neoplasia and breast cancer
in particular.

[0013] The present invention also discloses a method for the prediction, diagnosis or prognosis of malignant
neoplasia by the detection of at least 2 markers whereby the markers are located on one or more chromosomal region(s)
which is/are altered in malignant neoplasia; and the markers interact as (i) receptor and ligand or (ii) members of the
same signal transduction pathway or (iii) members of synergistic signal transduction pathways or (iv) members of
antagonistic signal transduction pathways or (v) transcription factor and transcription factor binding site.
[0014] Also disclosed is a method for the prediction, diagnosis or prognosis of malignant neoplasia by the detection
of at least one marker whereby the marker is a VNTR, SNP, RFLP or STS which is located on one chromosomal region
which is altered in malignant neoplasia due to amplification and the marker is detected in (a) a cancerous and (b) a
non-cancerous tissue or biological sample from the same individual. A preferred embodiment is the detection of at
least one VNTR marker of Table 6 or at least one SNP marker of Table 4 or combinations thereof. Even more preferably,
the detection, quantification and sizing of such polymorphic markers can be achieved by methods (a) for the
comparative measurement of amount and size by PCR amplification and subsequent capillary electrophoresis, (b) for
sequence determination and allelic discrimination by gel electrophoresis (e.g. SSCP, DGGE), real-time kinetic PCR,
direct DNA sequencing, pyrosequencing, mass-specific allelic discrimination or resequencing by DNA array
technologies, (c) for the determination of specific restriction patterns and subsequent electrophoretic separation and
(d) for allelic discrimination by allele-specific PCR (e.g. ASO). Even more favorably, detection of a heterozygous
VNTR, SNP, RFLP or STS is done in a multiplex fashion, utilizing a variety of labeled primers (e.g. fluorescent,
radioactive, bioactive) and a suitable capillary electrophoresis (CE) detection system.
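For illustration, the comparison of a heterozygous marker between cancerous and non-cancerous tissue of the same individual might be evaluated as sketched below. The sketch assumes that capillary electrophoresis yields one peak height per allele and applies a conventional allelic imbalance ratio. This calculation, the function name and the example peak heights are assumptions and are not part of the disclosure above.

```python
def allelic_imbalance_ratio(tumor_peaks, normal_peaks):
    """Compare the allele ratio of a heterozygous marker in tumor vs. normal tissue.

    tumor_peaks / normal_peaks: (allele_1, allele_2) peak heights obtained for
    the same marker in cancerous and non-cancerous tissue of one individual.
    Returns a value >= 1.0; values near 1.0 indicate balanced alleles.
    """
    t1, t2 = tumor_peaks
    n1, n2 = normal_peaks
    if min(t1, t2, n1, n2) <= 0:
        raise ValueError("peak heights must be positive (marker must be heterozygous)")

    # Normalizing by the normal-tissue ratio corrects for allele-specific
    # differences in amplification or detection.
    ratio = (t1 / t2) / (n1 / n2)
    # Report the imbalance independently of which allele is over-represented.
    return max(ratio, 1.0 / ratio)


if __name__ == "__main__":
    # Hypothetical peak heights: allele 1 is over-represented in the tumor.
    print(allelic_imbalance_ratio((900.0, 300.0), (500.0, 500.0)))
```

A ratio clearly above 1 would be consistent with amplification of one allele in the cancerous sample. The cut-off at which an imbalance is called would need to be validated for each marker.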

[0015] In another embodiment the expression of these genes can be detected with DNA arrays as described in
WO9727317 and US6379895.

[0016] In a further embodiment the expression of these genes can be detected with bead-based direct fluorescent
readout techniques such as described in WO9714028 and WO9952708.

[0017] In one embodiment, the invention pertains to a method of determining the phenotype of a cell or tissue,
comprising detecting the differential expression, relative to a normal or untreated cell, of at least one polynucleotide
comprising SEQ ID NO: 2 to 6, 8, 9, 11 to 16, 18, 19 or 21 to 26 or 53 to 75, wherein the polynucleotide is differentially
expressed by at least about 1.5 fold, at least about 2 fold or at least about 3 fold.
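The fold thresholds named above might be applied to measured expression values as in the following minimal sketch. It assumes that "differential expression" covers both increased and decreased expression and treats the thresholds as exact cut-offs, which simplifies the wording "at least about". The function names and expression values are hypothetical.

```python
# Fold thresholds taken from the paragraph above, highest first.
FOLD_THRESHOLDS = (3.0, 2.0, 1.5)


def fold_change(sample_level, reference_level):
    """Return the magnitude of the expression change as a fold value >= 1.0."""
    if sample_level <= 0 or reference_level <= 0:
        raise ValueError("expression levels must be positive")
    ratio = sample_level / reference_level
    # A 4-fold decrease (ratio 0.25) is reported as 4.0, like a 4-fold increase.
    return ratio if ratio >= 1.0 else 1.0 / ratio


def highest_threshold_met(sample_level, reference_level):
    """Return the highest fold threshold reached, or None if below all of them."""
    change = fold_change(sample_level, reference_level)
    for threshold in FOLD_THRESHOLDS:
        if change >= threshold:
            return threshold
    return None


if __name__ == "__main__":
    # Hypothetical expression values for a test cell and a normal cell.
    print(highest_threshold_met(250.0, 100.0))
    print(highest_threshold_met(100.0, 400.0))
    print(highest_threshold_met(120.0, 100.0))
```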

[0018] In a further aspect the invention pertains to a method of determining the phenotype of a cell or tissue,
comprising detecting the differential expression, relative to a normal or untreated cell, of at least one polynucleotide
which hybridizes under stringent conditions to one of the polynucleotides of SEQ ID NO: 2 to 6, 8, 9, 11 to 16, 18, 19
or 21 to 26 or 53 to 75 and encodes a polypeptide exhibiting the same biological function as given in Table 2 or 3 for
the respective polynucleotide, wherein the polynucleotide is differentially expressed by at least about 1.5 fold, at
least about 2 fold or at least about 3 fold.

[0019] In another embodiment of the invention a polynucleotide comprising a polynucleotide selected from SEQ ID
NO: 2 to 6, 8, 9, 11 to 16, 18, 19 or 21 to 26 and 53 to 75 or encoding one of the polypeptides with SEQ ID NO: 28 to
32, 34, 35, 37 to 42, 44, 45 or 47 to 52 or 76 to 98 can be used to identify cells or tissue in individuals which exhibit a
phenotype predisposed to breast cancer or a diseased phenotype, thereby (a) predicting whether an individual is at
risk for the development of, or (b) diagnosing whether an individual has, or (c) prognosing the progression or the
outcome of the treatment of malignant neoplasia and breast cancer in particular.

[0020] In yet another embodiment the invention provides a method for identifying genomic regions which are altered 
on the chromosomal level and encode genes that are linked by function and are differentially expressed in malignant 
neoplasia and breast cancer in particular. 
[0021] In yet another embodiment the invention provides the genomic regions 17q12, 3p21 and 12q13 for use in
prediction, diagnosis and prognosis as well as prevention and treatment of malignant neoplasia and breast cancer. In
particular not only the intragenic regions, but also intergenic regions, pseudogenes or non-transcribed genes of said
chromosomal regions can be used for diagnostic, predictive, prognostic, preventive and therapeutic compositions
and methods.

[0022] In yet another embodiment the invention provides methods of screening for agents which regulate the activity
of a polypeptide comprising a polypeptide selected from SEQ ID NO: 27 to 52 and 76 to 98 or encoded by a
polynucleotide comprising a polynucleotide selected from SEQ ID NO: 1 to 26 and 53 to 75. A test compound is contacted
with a polypeptide comprising a polypeptide selected from SEQ ID NO: 27 to 52 and 76 to 98 or encoded by a
polynucleotide comprising a polynucleotide selected from SEQ ID NO: 1 to 26 and 53 to 75. Binding of the test compound
to the polypeptide is detected. A test compound which binds to the polypeptide is thereby identified as a potential
therapeutic agent for the treatment of malignant neoplasia and more particularly breast cancer.
[0023] In yet another embodiment the invention provides another method of screening for agents which regulate
the activity of a polypeptide comprising a polypeptide selected from SEQ ID NO: 27 to 52 and 76 to 98 or encoded by
a polynucleotide comprising a polynucleotide selected from SEQ ID NO: 1 to 26 and 53 to 75. A test compound is
contacted with a polypeptide comprising a polypeptide selected from SEQ ID NO: 27 to 52 and 76 to 98 or encoded
by a polynucleotide comprising a polynucleotide selected from SEQ ID NO: 1 to 26 and 53 to 75. A biological activity
mediated by the polypeptide is detected. A test compound which decreases the biological activity is thereby identified
as a potential therapeutic agent for decreasing the activity of the polypeptide comprising a
polypeptide selected from SEQ ID NO: 27 to 52 and 76 to 98 or encoded by a polynucleotide comprising a polynucleotide
selected from SEQ ID NO: 1 to 26 and 53 to 75 in malignant neoplasia and breast cancer in particular. A test compound
which increases the biological activity is thereby identified as a potential therapeutic agent for increasing the activity
of the polypeptide selected from one of the polypeptides with SEQ ID NO: 27 to 52 and 76
to 98 or encoded by a polynucleotide comprising a polynucleotide selected from SEQ ID NO: 1 to 26 and 53 to 75 in
malignant neoplasia and breast cancer in particular.

[0024] In another embodiment the invention provides a method of screening for agents which regulate the activity
of a polynucleotide comprising a polynucleotide selected from SEQ ID NO: 1 to 26 and 53 to 75. A test compound is
contacted with a polynucleotide comprising a polynucleotide selected from SEQ ID NO: 1 to 26 and 53 to 75. Binding
of the test compound to the polynucleotide comprising a polynucleotide selected from SEQ ID NO: 1 to 26 and 53 to
75 is detected. A test compound which binds to the polynucleotide is thereby identified as a potential therapeutic agent
for regulating the activity of a polynucleotide comprising a polynucleotide selected from SEQ ID NO: 1 to 26 and 53 to
75 in malignant neoplasia and breast cancer in particular.

[0025] The invention thus provides polypeptides selected from one of the polypeptides with SEQ ID NO: 27 to 52
and 76 to 98 or encoded by a polynucleotide comprising a polynucleotide selected from SEQ ID NO: 1 to 26 and 53
to 75 which can be used to identify compounds which may act, for example, as regulators or modulators such as
agonists and antagonists, partial agonists, inverse agonists, activators, co-activators and inhibitors of the polypeptide
comprising a polypeptide selected from SEQ ID NO: 27 to 52 and 76 to 98 or encoded by a polynucleotide comprising
a polynucleotide selected from SEQ ID NO: 1 to 26 and 53 to 75. Accordingly, the invention provides reagents and
methods for regulating a polypeptide comprising a polypeptide selected from SEQ ID NO: 27 to 52 and 76 to 98 or
encoded by a polynucleotide comprising a polynucleotide selected from SEQ ID NO: 1 to 26 and 53 to 75 in malignant
neoplasia and more particularly breast cancer. The regulation can be an up- or down-regulation. Reagents that modulate
the expression, stability or amount of a polynucleotide comprising a polynucleotide selected from SEQ ID NO: 1 to 26
and 53 to 75 or the activity of the polypeptide comprising a polypeptide selected from SEQ ID NO: 27 to 52 and 76 to
98 or encoded by a polynucleotide comprising a polynucleotide selected from SEQ ID NO: 1 to 26 and 53 to 75 can
be a protein, a peptide, a peptidomimetic, a nucleic acid, a nucleic acid analogue (e.g. peptide nucleic acid, locked
nucleic acid) or a small molecule. Methods that modulate the expression, stability or amount of a polynucleotide
comprising a polynucleotide selected from SEQ ID NO: 1 to 26 and 53 to 75 or the activity of the polypeptide comprising
a polypeptide selected from SEQ ID NO: 27 to 52 and 76 to 98 or encoded by a polynucleotide comprising a
polynucleotide selected from SEQ ID NO: 1 to 26 and 53 to 75 can be gene replacement therapies, antisense, ribozyme and
triplex nucleic acid approaches.

[0026] One embodiment of the invention provides antibodies which specifically bind to a full-length or partial
polypeptide comprising a polypeptide selected from SEQ ID NO: 27 to 52 and 76 to 98 or encoded by a polynucleotide
comprising a polynucleotide selected from SEQ ID NO: 1 to 26 and 53 to 75 or a polynucleotide comprising a
polynucleotide selected from SEQ ID NO: 1 to 26 and 53 to 75 for use in prediction, prevention, diagnosis, prognosis and
treatment of malignant neoplasia and breast cancer in particular.

[0027] Yet another embodiment of the invention is the use of a reagent which specifically binds to a polynucleotide
comprising a polynucleotide selected from SEQ ID NO: 1 to 26 and 53 to 75 or a polypeptide comprising a polypeptide
selected from SEQ ID NO: 27 to 52 and 76 to 98 or encoded by a polynucleotide comprising a polynucleotide selected
from SEQ ID NO: 1 to 26 and 53 to 75 in the preparation of a medicament for the treatment of malignant neoplasia
and breast cancer in particular.

[0028] Still another embodiment is the use of a reagent that modulates the activity or stability of a polypeptide
comprising a polypeptide selected from SEQ ID NO: 27 to 52 and 76 to 98 or encoded by a polynucleotide comprising a
polynucleotide selected from SEQ ID NO: 1 to 26 and 53 to 75 or the expression, amount or stability of a polynucleotide
comprising a polynucleotide selected from SEQ ID NO: 1 to 26 and 53 to 75 in the preparation of a medicament for
the treatment of malignant neoplasia and breast cancer in particular.

[0029] Still another embodiment of the invention is a pharmaceutical composition which includes a reagent which
specifically binds to a polynucleotide comprising a polynucleotide selected from SEQ ID NO: 1 to 26 and 53 to 75 or
a polypeptide comprising a polypeptide selected from SEQ ID NO: 27 to 52 and 76 to 98 or encoded by a polynucleotide
comprising a polynucleotide selected from SEQ ID NO: 1 to 26 and 53 to 75, and a pharmaceutically acceptable carrier.
[0030] Yet another embodiment of the invention is a pharmaceutical composition including a polynucleotide
comprising a polynucleotide selected from SEQ ID NO: 1 to 26 and 53 to 75 or encoding a polypeptide comprising a
polypeptide selected from SEQ ID NO: 27 to 52 and 76 to 98.

[0031] In one embodiment, a reagent which alters the level of expression in a cell of a polynucleotide comprising a
polynucleotide selected from SEQ ID NO: 1 to 26 and 53 to 75 or encoding a polypeptide comprising a polypeptide
selected from SEQ ID NO: 27 to 52 and 76 to 98, or a sequence complementary thereto, is identified by providing a
cell, treating the cell with a test reagent, determining the level of expression in the cell of a polynucleotide comprising
a polynucleotide selected from SEQ ID NO: 1 to 26 and 53 to 75 or encoding a polypeptide comprising a polypeptide
selected from SEQ ID NO: 27 to 52 and 76 to 98 or a sequence complementary thereto, and comparing the level of
expression of the polynucleotide in the treated cell with the level of expression of the polynucleotide in an untreated
cell, wherein a change in the level of expression of the polynucleotide in the treated cell relative to the level of expression
of the polynucleotide in the untreated cell is indicative of an agent which alters the level of expression of the
polynucleotide in a cell.
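The comparison of treated and untreated cells described above might, purely as an illustrative sketch, be implemented as follows. The sketch assumes replicate expression measurements for both conditions and a minimum relative change below which a difference is disregarded. The function name, the replicate values and the default of 0.5 are hypothetical, and a real screen would apply an appropriate statistical test.

```python
from statistics import mean


def alters_expression(treated_levels, untreated_levels, min_relative_change=0.5):
    """Flag a test reagent whose treatment changes the mean expression level.

    treated_levels / untreated_levels: replicate expression measurements of the
    polynucleotide in treated and untreated cells.
    min_relative_change: hypothetical minimum change, relative to the untreated
    mean, required to call an effect (0.5 = 50 % up or down).
    """
    untreated_mean = mean(untreated_levels)
    if untreated_mean <= 0:
        raise ValueError("untreated expression level must be positive")
    treated_mean = mean(treated_levels)
    # Both increases and decreases count as a change in expression.
    relative_change = abs(treated_mean - untreated_mean) / untreated_mean
    return relative_change >= min_relative_change


if __name__ == "__main__":
    # Hypothetical replicates: expression drops to about 40 % after treatment.
    print(alters_expression([40, 44, 36], [100, 96, 104]))
```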

[0032] The invention further provides a pharmaceutical composition comprising a reagent identified by this method.
[0033] Another embodiment of the invention is a pharmaceutical composition which includes a polypeptide
comprising a polypeptide selected from SEQ ID NO: 27 to 52 and 76 to 98 or which is encoded by a polynucleotide comprising
a polynucleotide selected from SEQ ID NO: 1 to 26 and 53 to 75.

[0034] A further embodiment of the invention is a pharmaceutical composition comprising a polynucleotide including
a sequence which hybridizes under stringent conditions to a polynucleotide comprising a polynucleotide selected from
SEQ ID NO: 1 to 26 and 53 to 75 and encoding a polypeptide exhibiting the same biological function as given for the
respective polynucleotide in Table 2 or 3, or encoding a polypeptide comprising a polypeptide selected from SEQ ID
NO: 27 to 52 and 76 to 98. Pharmaceutical compositions useful in the present invention may further include fusion
proteins comprising a polypeptide comprising a polypeptide selected from SEQ ID NO: 27 to 52 and 76 to 98, or a
fragment thereof, antibodies, or antibody fragments.



BRIEF DESCRIPTION OF THE DRAWINGS 



[0035] 

Fig. 1 shows a sketch of chromosome 17 with G-banding pattern and cytogenetic positions. In the blow-up at
the lower part of the figure a detailed view of the chromosomal area of the long arm of chromosome 17
(17q12-21.1) is provided. Each vertical rectangle, depicted in medium gray, represents a gene as labeled below
or above the individual position. The order of genes depicted in this graph has been deduced from experiments
examining amplification and overexpression and from publicly available data (e.g. UCSC, NCBI or Ensembl).

Fig. 2 shows the same region as depicted in Fig. 1 and a cluster representation of the individual expression
values measured by DNA-chip hybridization. The squares representing genes are indicated by a dotted line.
In the upper part of the cluster representation 4 tumor cell lines, of which two harbor a known HER-2/neu
overexpression (SKBR3 and AU565), are depicted with their individual expression profiles. Not only the
HER-2/neu gene shows a clear overexpression but, as provided by this invention, so do several other genes within
the surrounding region. In the middle part of the cluster representation expression data obtained from
immunohistochemically characterized tumor samples are presented. Two of the depicted probes show a significant
overexpression of genes marked by the white rectangles. For additional information and comparison, expression
profiles of several non-diseased human tissues (RNAs obtained from Clontech Inc.) are provided. Human brain and
neural tissue display the closest relation to the expression profile of HER-2/neu positive tumors.

Fig. 3 provides data from DNA amplification measurements by qPCR (e.g. TaqMan). The data indicate that several
analyzed breast cancer cell lines harbor amplification of genes which are located in the previously described
region (ARCHEON). Data are displayed for each gene on the x-axis and 40-Ct on the y-axis. Data were
normalized to the expression level of GAPDH as seen in the first group of columns.

Fig. 4 represents a graphical overview of the amplified regions and provides information on the length of the
individual amplification and overexpression in the analyzed tumor cell lines. The length of the amplification and
the composition of genes have a significant impact on the nature of the cancer cell and on the responsiveness
to certain drugs, as described elsewhere.

DETAILED DESCRIPTION OF THE INVENTION

Definitions 

[0036] "Differential expression", as used herein, refers to both quantitative and qualitative differences in the
genes' expression patterns depending on differential development and/or tumor growth. Differentially expressed genes
may represent "marker genes" and/or "target genes". The expression pattern of a differentially expressed gene
disclosed herein may be utilized as part of a prognostic or diagnostic breast cancer evaluation. Alternatively, a
differentially expressed gene disclosed herein may be used in methods for identifying reagents and compounds and uses
of these reagents and compounds for the treatment of breast cancer as well as methods of treatment.
[0037] "Biological activity" or "bioactivity" or "activity" or "biological function", which are used interchangeably
herein, mean an effector or antigenic function that is directly or indirectly performed by a polypeptide (whether in its
native or denatured conformation), or by any fragment thereof in vivo or in vitro. Biological activities include but are
not limited to binding to polypeptides, binding to other proteins or molecules, enzymatic activity, signal transduction,
activity as a DNA binding protein, as a transcription regulator, ability to bind damaged DNA, etc. A bioactivity can be
modulated by directly affecting the subject polypeptide. Alternatively, a bioactivity can be altered by modulating the
level of the polypeptide, such as by modulating expression of the corresponding gene.

[0038] The term "marker" or "biomarker" refers to a biological molecule, e.g., a nucleic acid, peptide, hormone, etc.,
whose presence or concentration can be detected and correlated with a known condition, such as a disease state.
[0039] "Marker gene", as used herein, refers to a differentially expressed gene whose expression pattern may be
utilized as part of predictive, prognostic or diagnostic malignant neoplasia or breast cancer evaluation, or which,
alternatively, may be used in methods for identifying compounds useful for the treatment or prevention of malignant
neoplasia and breast cancer in particular. A marker gene may also have the characteristics of a target gene.
[0040] "Target gene", as used herein, refers to a differentially expressed gene involved in breast cancer in a manner
by which modulation of the level of target gene expression or of target gene product activity may act to ameliorate
symptoms of malignant neoplasia and breast cancer in particular. A target gene may also have the characteristics of
a marker gene.

[0041] The term "biological sample", as used herein, refers to a sample obtained from an organism or from
components (e.g., cells) of an organism. The sample may be of any biological tissue or fluid. Frequently the sample will
be a "clinical sample" which is a sample derived from a patient. Such samples include, but are not limited to, sputum,
blood, blood cells (e.g., white cells), tissue or fine needle biopsy samples, cell-containing body fluids, free floating
nucleic acids, urine, peritoneal fluid, and pleural fluid, or cells therefrom. Biological samples may also include sections
of tissues such as frozen sections taken for histological purposes.

[0042] By "array" or "matrix" is meant an arrangement of addressable locations or "addresses" on a device. The
locations can be arranged in two dimensional arrays, three dimensional arrays, or other matrix formats. The number
of locations can range from several to at least hundreds of thousands. Most importantly, each location represents a
totally independent reaction site. Arrays include but are not limited to nucleic acid arrays, protein arrays and antibody
arrays. A "nucleic acid array" refers to an array containing nucleic acid probes, such as oligonucleotides,
polynucleotides or larger portions of genes. The nucleic acid on the array is preferably single stranded. Arrays wherein
the probes are oligonucleotides are referred to as "oligonucleotide arrays" or "oligonucleotide chips". A "microarray",
herein also referred to as a "biochip" or "biological chip", is an array of regions having a density of discrete regions
of at least about 100/cm², and preferably at least about 1000/cm². The regions in a microarray have typical dimensions,
e.g., diameters, in the range of between about 10-250 µm, and are separated from other regions in the array by about
the same distance. A "protein array" refers to an array containing polypeptide probes or protein probes which can be in
native form or denatured. An "antibody array" refers to an array containing antibodies which include but are not limited
to monoclonal antibodies (e.g. from a mouse), chimeric antibodies, humanized antibodies or phage antibodies and single
chain antibodies as well as fragments from antibodies.

[0043] The term "agonist", as used herein, is meant to refer to an agent that mimics or upregulates (e.g., potentiates
or supplements) the bioactivity of a protein. An agonist can be a wild-type protein or derivative thereof having at least
one bioactivity of the wild-type protein. An agonist can also be a compound that upregulates expression of a gene or
which increases at least one bioactivity of a protein. An agonist can also be a compound which increases the interaction
of a polypeptide with another molecule, e.g., a target peptide or nucleic acid.

[0044] The term "antagonist", as used herein, is meant to refer to an agent that downregulates (e.g., suppresses or
inhibits) at least one bioactivity of a protein. An antagonist can be a compound which inhibits or decreases the interaction
between a protein and another molecule, e.g., a target peptide, a ligand or an enzyme substrate. An antagonist can
also be a compound that downregulates expression of a gene or which reduces the amount of expressed protein
present.

[0045] "Small molecule", as used herein, is meant to refer to a composition which has a molecular weight of less
than about 5 kD and most preferably less than about 4 kD. Small molecules can be nucleic acids, peptides, polypeptides,
peptidomimetics, carbohydrates, lipids or other organic (carbon-containing) or inorganic molecules. Many
pharmaceutical companies have extensive libraries of chemical and/or biological mixtures, often fungal, bacterial, or
algal extracts, which can be screened with any of the assays of the invention to identify compounds that modulate a
bioactivity.
[0046] The terms "modulated" or "modulation" or "regulated" or "regulation" and "differentially regulated" as used
herein refer to both upregulation [i.e., activation or stimulation (e.g., by agonizing or potentiating)] and downregulation
[i.e., inhibition or suppression (e.g., by antagonizing, decreasing or inhibiting)].

[0047] "Transcriptional regulatory unit" refers to DNA sequences, such as initiation signals, enhancers, and promot- 
ers, which induce or control transcription of protein coding sequences with which they are operably linked. In preferred 
embodiments, transcription of one of the genes is under the control of a promoter sequence (or other transcriptional 
regulatory sequence) which controls the expression of the recombinant gene in a cell-type in which expression is
intended. It will also be understood that the recombinant gene can be under the control of transcriptional regulatory 
sequences which are the same or which are different from those sequences which control transcription of the naturally 
occurring forms of the polypeptide. 

[0048] The term "derivative" refers to the chemical modification of a polypeptide sequence, or a polynucleotide sequence.
Chemical modifications of a polynucleotide sequence can include, for example, replacement of hydrogen by
an alkyl, acyl, or amino group. A derivative polynucleotide encodes a polypeptide which retains at least one biological 
or immunological function of the natural molecule. A derivative polypeptide is one modified by glycosylation, pegylation, 
or any similar process that retains at least one biological or immunological function of the polypeptide from which it 
was derived. 

[0049] The term "nucleotide analog" refers to oligomers or polymers that differ in at least one feature from
naturally occurring nucleotides, oligonucleotides or polynucleotides, but exhibit functional features of the respective
naturally occurring nucleotides (e.g. base pairing, hybridization, coding information) and that can be used for said
compositions. The nucleotide analogs can consist of non-naturally occurring bases or polymer backbones, examples of
which are LNAs, PNAs and Morpholinos. The nucleotide analog has at least one molecule different from its naturally
occurring counterpart or equivalent.

[0050] "BREAST CANCER GENES" or "BREAST CANCER GENE" as used herein refers to the polynucleotides of 
SEQ ID NO: 1 to 26 and 53 to 75, as well as derivatives, fragments, analogs and homologues thereof, the polypeptides 
encoded thereby, the polypeptides of SEQ ID NO: 27 to 52 and 76 to 98 as well as derivatives, fragments, analogs 
and homologues thereof and the corresponding genomic transcription units which can be derived or identified with
standard techniques well known in the art using the information disclosed in Tables 1 to 5 and Figures 1 to 4. The
GenBank, LocusLink ID and UniGene accession numbers of the polynucleotide sequences of SEQ ID NO: 1 to
26 and 53 to 75 and the polypeptides of SEQ ID NO: 27 to 52 and 76 to 98 are shown in Table 1; the gene description,
gene function and subcellular localization are given in Tables 2 and 3.

[0051] The term "chromosomal region" as used herein refers to a consecutive DNA stretch on a chromosome which 
can be defined by cytogenetic or other genetic markers such as restriction fragment length polymorphisms (RFLPs), single
nucleotide polymorphisms (SNPs), expressed sequence tags (ESTs), sequence tagged sites (STSs), micro-satellites, 
variable number of tandem repeats (VNTRs) and genes. Typically a chromosomal region consists of up to 2 Megabases 
(MB), up to 4 MB, up to 6 MB, up to 8 MB, up to 10 MB, up to 20 MB or even more MB. 

[0052] The term "altered chromosomal region" or "aberrant chromosomal region" refers to a structural change of
the chromosomal composition and DNA sequence, which can occur by the following events: amplifications, deletions,
inversions, insertions, translocations and/or viral integrations. A trisomy, where a given cell harbors more than two 
copies of a chromosome, is within the meaning of the term "amplification" of a chromosome or chromosomal region. 

[0053] The present invention provides polynucleotide sequences and proteins encoded thereby, as well as probes
derived from the polynucleotide sequences, antibodies directed to the encoded proteins, and predictive, preventive,
diagnostic, prognostic and therapeutic uses for individuals who are at risk for or who have malignant neoplasia and
breast cancer in particular. The sequences disclosed herein have been found to be differentially expressed in samples
from breast cancer.

[0054] The present invention is based on the identification of 43 genes that are differentially regulated (up- or
downregulated) in tumor biopsies of patients with clinical evidence of breast cancer. The identification of 43 human genes
which were not known to be differentially regulated in breast cancer states and their significance for the disease is
described in the working examples herein. The characterization of the co-expression of these genes provides newly
identified roles in breast cancer. The gene names, the database accession numbers (GenBank and UniGene) as well
as the putative or known functions of the encoded proteins and their subcellular localization are given in Tables 1 to
4. The primer sequences used for the gene amplification are shown in Table 5.

[0055] In either situation, detecting expression of these genes in excess or at a lower level as compared to normal
expression provides the basis for the diagnosis of malignant neoplasia and breast cancer. Furthermore, in testing the
efficacy of compounds during clinical trials, a decrease in the level of the expression of these genes corresponds to a
return from a disease condition to a normal state, and thereby indicates a positive effect of the compound.
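
The comparison against normal expression described in the preceding paragraph can be sketched as follows. This is a minimal illustration only, assuming that expression levels are positive, already normalized values and that a two-fold change (log2 ratio of 1.0) marks differential expression. The function name classify_expression and the threshold are hypothetical and are not taken from the disclosure.

```python
import math


def classify_expression(tumor_level, normal_level, threshold_log2=1.0):
    """Classify a gene as over- or underexpressed relative to normal tissue.

    tumor_level and normal_level are assumed to be positive, normalized
    expression values. threshold_log2 is an illustrative cut-off
    (1.0 corresponds to a two-fold change).
    """
    if tumor_level <= 0 or normal_level <= 0:
        raise ValueError("expression levels must be positive")

    # log2 ratio of tumor versus normal expression
    log2_ratio = math.log2(tumor_level / normal_level)

    if log2_ratio >= threshold_log2:
        return "overexpressed"
    if log2_ratio <= -threshold_log2:
        return "underexpressed"
    return "unchanged"


if __name__ == "__main__":
    # Hypothetical values: four-fold higher expression in the tumor sample
    print(classify_expression(8.0, 2.0))
```

In a treatment-monitoring setting, the same comparison could be repeated before and after administration of a compound to see whether an overexpressed gene moves back toward the "unchanged" class.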

[0056] Another aspect of the present invention is based on the observation that neighboring genes within defined
genomic regions functionally interact and influence each other's function directly or indirectly. A genomic region
encoding functionally interacting genes that are co-amplified and co-expressed in neoplastic lesions has been defined as an
"ARCHEON" (ARCHEON = Altered Region of Changed Chromosomal Expression Observed in Neoplasms).
Chromosomal alterations often affect more than one gene. This is true for amplifications, duplications, insertions,
integrations, inversions, translocations, and deletions. These changes can influence the expression level of single
or multiple genes. Most commonly in the field of cancer diagnostics and treatment, changes of expression levels
have been investigated for single, putatively relevant target genes such as MLVI2 (5p14), NRASL3 (6p12), EGFR (7p12),
c-myc (8q23), Cyclin D1 (11q13), IGF1R (15q25), HER-2/neu (17q12) and PCNA (20q12). However, the altered expression
level and interaction of multiple (i.e. more than two) genes within one genomic region with each other has not been
addressed. Genes of an ARCHEON form gene clusters with tissue specific expression patterns. The mode of interaction
of individual genes within such a gene cluster suspected to represent an ARCHEON can be either protein-protein or
protein-nucleic acid interaction, which may be illustrated but not limited by the following examples: ARCHEON gene
interaction may be in the same signal transduction pathway, may be receptor to ligand binding, receptor kinase and
SH2 or SH3 binding, transcription factor to promoter binding, nuclear hormone receptor to transcription factor binding,
phosphogroup donation (e.g. kinases) and acceptance (e.g. phosphoprotein), mRNA stabilizing protein binding and
transcriptional processes. The individual activity and specificity of a pair of genes and/or the proteins encoded thereby,
or of a group of such in a higher order, may be readily deduced by the skilled person from literature published or
deposited within public databases. However, in the context of an ARCHEON the interaction of its members
will potentiate, exaggerate or reduce their singular functions. This interaction is of importance in defined
normal tissues in which they are normally co-expressed. Therefore, these clusters have been commonly conserved
during evolution. The aberrant expression of members of these ARCHEONs in neoplastic lesions, however (especially
within tissues in which they are normally not expressed), has influence on tumor characteristics such as growth,
invasiveness and drug responsiveness. Due to the interaction of these neighboring genes it is of importance to determine
the members of the ARCHEON which are involved in the deregulation events. In this regard amplification and deletion
events in neoplastic lesions are of special interest.

[0057] The invention relates to a method for the detection of chromosomal alterations by (a) determining the relative
mRNA abundance of individual mRNA species or (b) determining the copy number of one or more chromosomal region(s)
by quantitative PCR. In one embodiment information on the genomic organization and spatial regulation of
chromosomal regions is assessed by bioinformatic analysis of the sequence information of the human genome (UCSC,
NCBI) and then combined with RNA expression data from GeneChip™ DNA-Arrays (Affymetrix) and/or quantitative
PCR (TaqMan) from RNA-samples or genomic DNA.
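
One common way to estimate relative copy number or relative mRNA abundance from quantitative PCR data is the comparative threshold cycle (2^-ddCt) calculation. The sketch below assumes that approach, which is not specified in the preceding paragraph. The function name, the use of a single reference gene and the default amplification efficiency of 2.0 (perfect doubling per cycle) are illustrative assumptions.

```python
def relative_copy_number(ct_target_sample, ct_reference_sample,
                         ct_target_control, ct_reference_control,
                         efficiency=2.0):
    """Estimate the target-gene level in a sample relative to a control.

    All ct_* arguments are threshold cycle (Ct) values from quantitative PCR.
    The reference gene is assumed to be unaltered in both sample and control.
    efficiency is the assumed per-cycle amplification factor.
    """
    # Normalize the target gene to the reference gene within each specimen
    delta_sample = ct_target_sample - ct_reference_sample
    delta_control = ct_target_control - ct_reference_control

    # Compare the normalized sample value with the normalized control value
    delta_delta_ct = delta_sample - delta_control

    # A lower Ct means more starting template, hence the negative exponent
    return efficiency ** (-delta_delta_ct)


if __name__ == "__main__":
    # Hypothetical Ct values: the target crosses threshold 3 cycles earlier
    # in the tumor sample than in the control, at equal reference-gene Ct
    ratio = relative_copy_number(22.0, 25.0, 25.0, 25.0)
    print(f"relative level: {ratio:.1f}")
```

A ratio clearly above 1 would be consistent with amplification of the target region, and a ratio clearly below 1 with a deletion. Any decision threshold would have to be set empirically.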

[0058] In a further embodiment the functional relationship of genes located on a chromosomal region which is altered 
(amplified or deleted) is established. The altered chromosomal region is defined as an ARCHEON if genes located on
that region functionally interact.

[0059] The 17q12 locus, harboring the HER-2/neu gene, was investigated as one model system. By establishing a
high-resolution assay to detect amplification events in neighboring genes, 43 genes that are commonly co-amplified
in breast cancer cell lines and patient samples were identified. By gene array technologies and immunological methods
their co-overexpression in tumor samples was demonstrated. Surprisingly, by clustering tissue samples with HER-2/neu
positive tumor samples, it was found that the expression pattern of this larger genomic region (consisting of 43
genes) is very similar to control brain tissue. HER-2/neu negative breast tumor tissue did not show a similar expression
pattern. Indeed, some of the genes within this cluster are important for neural development (HER-2/neu, THRA) in
mouse model systems or are described to be expressed in neural cells (NeuroD2). Moreover, by searching similar
gene combinations in the human and rodent genome additional homologous chromosomal regions on chromosome
3p21 and 12q13 harboring several isoforms of the respective genes (see below) were found. There was strong
evidence for multiple interactions between the 43 candidate genes, as being part of identical pathways (HER-2/neu,
GRB7, CrkRS, CDC6), influencing the expression of each other (HER-2/neu, THRA, RARA), interacting with each
other (PPARGBP, THRA, RARA, NR1D1 or HER-2/neu, GRB7) or expressed in defined tissues (CACNB1, PPARGBP,
etc.). Interestingly, the genomic regions of the ARCHEONs that were identified are amplified in acquired Tamoxifen
resistance of HER-2/neu negative cells (MCF7), which are normally sensitive to Tamoxifen treatment [Achuthan et al.,
2001, (2)].

[0060] Moreover, altered responsiveness to treatment due to the alterations of the genes within these ARCHEONs
was observed. Surprisingly, genes within the ARCHEONs are of importance even in the absence of HER-2/neu
homologues. Some of the genes within the ARCHEONs not only serve as marker genes for prognostic purposes, but
have already been known as targets for therapeutic intervention. For example, TOP2 alpha is a target of anthracyclines.
THRA and RARA can be targeted by hormones and hormone analogs (e.g. T3, rT3, RA). Due to their high affinity
binding sites and available screening assays (reporter assays based on their transcriptional potential) the hormone
receptors, which are shown to be linked to neoplastic pathophysiology for the first time herein, are ideal targets for drug
screening and treatment of malignant neoplasia and breast cancer in particular. In this regard it is essential to know
which members of the ARCHEON are altered in the neoplastic lesions. Particularly it is important to know the nature,
number and extent to which the ARCHEON genes are amplified or deleted. The ARCHEONs are flanked by similar,
endogenous retroviruses (e.g. HERV-K = "human endogenous retrovirus"), some of which are activated in breast
cancer. These viruses may have also been involved in the evolutionary duplication of the ARCHEONs.

[0061] The analysis of the 17q12 region confirmed data obtained by IHC and identified several additional genes being
co-amplified with the HER-2/neu gene. Comparative analysis of RNA-based quantitative RT-PCR (TaqMan) with DNA-based
qPCR from tumor cell lines identified the same amplified region. Genes at the 17q11.2-21 region are offered
by way of illustration, not by way of limitation. A graphical display of the described chromosomal region is provided in
Figure 1.

Biological relevance of the genes which are part of the 17q12 ARCHEON

MLN50

[0062] By differential screening of cDNAs from breast cancer-derived metastatic axillary lymph nodes, TRAF4 and
3 other novel genes (MLN51, MLN62, MLN64) were identified that are overexpressed in breast cancer [Tomasetto et
al., 1995, (3)]. One gene, which they designated MLN50, was mapped to 17q11-q21.3 by radioactive in situ
hybridization. In breast cancer cell lines, overexpression of the 4 kb MLN50 mRNA was correlated with amplification of the
gene and with amplification and overexpression of ERBB2, which maps to the same region. The authors suggested
that the 2 genes belong to the same amplicon. Amplification of chromosomal region 17q11-q21 is one of the most
common events occurring in human breast cancers. They reported that the predicted 261-amino acid MLN50 protein
contains an N-terminal LIM domain and a C-terminal SH3 domain. They renamed the protein LASP1, for 'LIM and SH3
protein.' Northern blot analysis revealed that LASP1 mRNA was expressed at a basal level in all normal tissues
examined and overexpressed in 8% of primary breast cancers. In most of these cancers, LASP1 and ERBB2 were
simultaneously overexpressed.

MLLT6 

[0063] The MLLT6 (AF17) gene encodes a protein of 1,093 amino acids, containing a leucine-zipper dimerization
motif located 3-prime of the fusion point and a cysteine-rich domain at the N terminus. AF17 was found to contain
stretches of amino acids previously associated with domains involved in transcriptional repression or activation.

[0064] Chromosome translocations involving band 11q23 are associated with approximately 10% of patients with
acute lymphoblastic leukemia (ALL) and more than 5% of patients with acute myeloid leukemia (AML). The gene at
11q23 involved in the translocations is variously designated ALL1, HRX, MLL, and TAX1. The partner gene in one of
the rarer translocations, t(11;17)(q23;q21), was designated MLLT6 and is located on 17q12.

ZNF144 (Mel18)

[0065] Mel18 cDNA encodes a novel cys-rich zinc finger motif. The gene is expressed strongly in most tumor cell
lines, but its normal tissue expression was limited to cells of neural origin and was especially abundant in fetal neural
cells. It belongs to a RING-finger motif family which includes BMI1. The MEL18/BMI1 gene family represents a
mammalian homolog of the Drosophila 'polycomb' gene group, thereby belonging to a memory mechanism involved in
maintaining the expression pattern of key regulatory factors such as Hox genes. The Bmi1, Mel18 and M33 genes are
representative examples of mouse Pc-G genes. Common phenotypes observed in knockout mice mutant for each of
these genes indicate an important role for Pc-G genes not only in regulation of Hox gene expression and axial skeleton
development but also in control of proliferation and survival of haematopoietic cell lineages. This is in line with the
proliferative deregulation observed in lymphoblastic leukemia. The MEL18 gene is conserved among
vertebrates. Its mRNA is expressed at high levels in placenta, lung, and kidney, and at lower levels in liver, pancreas, and
skeletal muscle. Interestingly, cervical and lumbo-sacral HOX gene expression is altered in several primary breast
cancers with respect to normal breast tissue, with the HoxB gene cluster being present on 17q distal to the 17q12 locus.
Moreover, delay of differentiation with persistent nests of proliferating cells was found in endothelial cells cocultured
with HOXB7-transduced SkBr3 cells, which exhibit a 17q12 amplification. Tumorigenicity of these cells has been
evaluated in vivo. Xenografts in athymic nude mice showed that SkBr3/HOXB7 cells developed tumors with an increased
number of blood vessels, whether or not the mice were irradiated, whereas parental SkBr3 cells did not show any tumor
take unless mice were sublethally irradiated. As part of this invention, we have found MEL18 to be overexpressed
specifically in tumors bearing Her-2/neu gene amplification, which can be critical for Hox expression.

PHOSPHATIDYLINOSITOL-4-PHOSPHATE 5-KINASE, TYPE II, BETA; PIP5K2B 

[0066] Phosphoinositide kinases play central roles in signal transduction. Phosphatidylinositol-4-phosphate 5-kinases
(PIP5Ks) phosphorylate phosphatidylinositol 4-phosphate, giving rise to phosphatidylinositol 4,5-bisphosphate. The
PIP5K enzymes exist as multiple isoforms that have various immunoreactivities, kinetic properties, and molecular
masses. They are unique in that they possess almost no homology to the kinase motifs present in other
phosphatidylinositol, protein, and lipid kinases. By screening a human fetal brain cDNA library with the PIP5K2B EST, the
full-length gene was isolated. The deduced 416-amino acid protein is 78% identical to PIP5K2A. Using SDS-PAGE, the
authors estimated that bacterially expressed PIP5K2B has a molecular mass of 47 kD. Northern blot analysis detected
a 6.3-kb PIP5K2B transcript which was abundantly expressed in several human tissues. PIP5K2B interacts specifically
with the juxtamembrane region of the p55 TNF receptor (TNFR1) and PIP5K2B activity is increased in mammalian
cells by treatment with TNF-alpha. A modeled complex with membrane-bound substrate and ATP shows how a
phosphoinositide kinase can phosphorylate its substrate in situ at the membrane interface. The substrate-binding site is
open on one side, consistent with dual specificity for phosphatidylinositol 3- and 5-phosphates. Although the amino acid
sequence of PIP5K2A does not show homology to known kinases, recombinant PIP5K2A exhibited kinase activity.
PIP5K2A contains a putative Src homology 3 (SH3) domain-binding sequence. Overexpression of mouse PIP5K1B in
COS7 cells induced an increase in short actin fibers and a decrease in actin stress fibers.

TEM7 

[0067] Using serial analysis of gene expression (SAGE), partial cDNAs corresponding to several tumor endothelial
markers (TEMs) that displayed elevated expression during tumor angiogenesis could be identified. Among the genes
identified was TEM7. Using database searches and 5-prime RACE the entire TEM7 coding region, which encodes a
500-amino acid type I transmembrane protein, has been described. The extracellular region of TEM7 contains a
plexin-like domain and has weak homology to the ECM protein nidogen. The function of these domains, which are usually
found in secreted and extracellular matrix molecules, is unknown. Nidogen itself belongs to the entactin protein family
and helps to determine pathways of migrating axons by switching from circumferential to longitudinal migration. Entactin
is involved in cell migration, as it promotes trophoblast outgrowth through a mechanism mediated by the RGD
recognition site, and plays an important role during invasion of the endometrial basement membrane at implantation. As
entactin promotes thymocyte adhesion but affects thymocyte migration only marginally, it is suggested that entactin
may play a role in thymocyte localization during T cell development.

[0068] In situ hybridization analysis of human colorectal cancer demonstrated that TEM7 was expressed clearly in
the endothelial cells of the tumor stroma but not in the endothelial cells of normal colonic tissue. Using in situ
hybridization to assay expression in various normal adult mouse tissues, it was observed that TEM7 was largely undetectable
in mouse tissues or tumors, but was abundantly expressed in mouse brain.

ZNFN1A3 

[0069] By screening a B-cell cDNA library with a mouse Aiolos N-terminal cDNA probe, a cDNA encoding human
Aiolos, or ZNFN1A3, was obtained. The deduced 509-amino acid protein, which is 86% identical to its mouse
counterpart, has 4 DNA-binding zinc fingers in its N terminus and 2 zinc fingers that mediate protein dimerization in its C
terminus. These domains are 100% and 96% homologous to the corresponding domains in the mouse protein,
respectively. Northern blot analysis revealed strong expression of a major 11.0- and a minor 4.4-kb ZNFN1A3 transcript in
peripheral blood leukocytes, spleen, and thymus, with lower expression in liver, small intestine, and lung.

[0070] Ikaros (ZNFN1A1), a hemopoietic zinc finger DNA-binding protein, is a central regulator of lymphoid
differentiation and is implicated in leukemogenesis. The execution of normal function of Ikaros requires sequence-specific
DNA binding, transactivation, and dimerization domains. Mice with a mutation in a related zinc finger protein, Aiolos,
are prone to B-cell lymphoma. In chemically induced murine lymphomas allelic losses on markers surrounding the
Znfn1a1 gene were detected in 27% of the tumors analyzed. Moreover, specific Ikaros expression was found in primary
mouse hormone-producing anterior pituitary cells and is important for fibroblast growth factor receptor 4 (FGFR4) expression,
which itself is implicated in a multitude of endocrine cell hormonal and proliferative properties, with FGFR4 being
differentially expressed in normal and neoplastic pituitary. Moreover, Ikaros binds to chromatin remodelling complexes
containing SWI/SNF proteins, which antagonize Polycomb function. Interestingly, at the telomeric end of the disclosed
ARCHEON the SWI/SNF complex member SMARCE1 (= SWI/SNF-related, matrix-associated, actin-dependent
regulators of chromatin) is located and is part of the described amplification. Due to the related binding specificities of Ikaros
and Palindrom Binding Protein (PBP) it is suggested that ZNFN1A3 is able to regulate the Her-2/neu enhancer.

PPP1R1B

[0071] Midbrain dopaminergic neurons play a critical role in multiple brain functions, and abnormal signaling through 
dopaminergic pathways has been implicated in several major neurologic and psychiatric disorders. One well-studied 
target for the actions of dopamine is DARPP32. In the densely dopamine- and glutamate-innervated rat
caudate-putamen, DARPP32 is expressed in medium-sized spiny neurons that also express dopamine D1 receptors. The
function of DARPP32 seems to be regulated by receptor stimulation. Both dopaminergic and glutamatergic (NMDA) receptor
stimulation regulate the extent of DARPP32 phosphorylation, but in opposite directions. 

[0072] The human DARPP32 was isolated from a striatal cDNA library. The 204-amino acid DARPP32 protein shares
88% and 85% sequence identity, respectively, with bovine and rat DARPP32 proteins. The DARPP32 sequence is
particularly conserved through the N terminus, which represents the active portion of the protein. Northern blot analysis
demonstrated that the 2.1-kb DARPP32 mRNA is more highly expressed in human caudate than in cortex. In situ
hybridization to postmortem human brain showed a low level of DARPP32 expression in all neocortical layers, with
the strongest hybridization in the superficial layers. CDK5 phosphorylated DARPP32 in vitro and in intact brain cells.
Phospho-thr75 DARPP32 inhibits PKA in vitro by a competitive mechanism. Decreasing phospho-thr75 DARPP32 in
striatal cells either by a CDK5-specific inhibitor or by using genetically altered mice resulted in increased
dopamine-induced phosphorylation of PKA substrates and augmented peak voltage-gated calcium currents. Thus, DARPP32 is
a bifunctional signal transduction molecule which, by distinct mechanisms, controls a serine/threonine kinase and a
serine/threonine phosphatase.

[0073] DARPP32 and t-DARPP are overexpressed in gastric cancers. It has been suggested that overexpression of these 2
proteins in gastric cancers may provide an important survival advantage to neoplastic cells. It could be demonstrated
that Darpp32 is an obligate intermediate in progesterone-facilitated sexual receptivity in female rats and mice. The
facilitative effect of progesterone on sexual receptivity in female rats was blocked by antisense oligonucleotides to
Darpp32. Homozygous mice carrying a null mutation for the Darpp32 gene exhibited minimal levels of
progesterone-facilitated sexual receptivity when compared to their wildtype littermates, and progesterone significantly increased
hypothalamic cAMP levels and cAMP-dependent protein kinase activity.

CACNB1 

[0074] In 1991 a cDNA clone encoding a protein with high homology to the beta subunit of the rabbit skeletal muscle
dihydropyridine-sensitive calcium channel was isolated from a rat brain cDNA library [Pragnell et al., 1991, (4)]. This rat
brain beta-subunit cDNA hybridized to a 3.4-kb message that was expressed in high levels in the cerebral hemispheres and
hippocampus and much lower levels in cerebellum. The open reading frame encodes 597 amino acids with a predicted
mass of 65,679 Da which is 82% homologous with the skeletal muscle beta subunit. The corresponding human
beta-subunit gene was localized to chromosome 17 by analysis of somatic cell hybrids. The authors suggested that the
encoded brain beta subunit, which has a primary structure highly similar to its isoform in skeletal muscle, may have a
comparable role as an integral regulatory component of a neuronal calcium channel.

RPL19 

[0075] The ribosome is the only organelle conserved between prokaryotes and eukaryotes. In eukaryotes, this
organelle consists of a 60S large subunit and a 40S small subunit. The mammalian ribosome contains 4 species of RNA
and approximately 80 different ribosomal proteins, most of which appear to be present in equimolar amounts. In
mammalian cells, ribosomal proteins can account for up to 15% of the total cellular protein, and the expression of the different
ribosomal protein genes, which can account for up to 7 to 9% of the total cellular mRNAs, is coordinately regulated to
meet the cell's varying requirements for protein synthesis. The mammalian ribosomal protein genes are members of
multigene families, most of which are composed of multiple processed pseudogenes and a single functional
intron-containing gene. The presence of multiple pseudogenes hampered the isolation and study of the functional ribosomal
protein genes. By study of somatic cell hybrids, it has been elucidated that DNA sequences complementary to 6
mammalian ribosomal protein cDNAs could be assigned to chromosomes 5, 8, and 17. Ten fragments mapped to 3
chromosomes [Nakamichi et al., 1986, (5)]. These are probably a mixture of functional (expressed) genes and pseudogenes.
One that maps to 5q23-q33 rescues Chinese hamster emetine-resistance mutations in interspecies hybrids and is
therefore the transcriptionally active RPS14 gene. In 1989 a PCR-based strategy for the detection of intron-containing
genes in the presence of multiple pseudogenes was described. This technique was used to identify the intron-containing
PCR products of 7 human ribosomal protein genes and to map their chromosomal locations by hybridization to
human/rodent somatic cell hybrids [Feo et al., 1992, (6)]. All 7 ribosomal protein genes were found to be on different
chromosomes: RPL19 on 17p12-q11; RPL30 on 8; RPL35A on 18; RPL36A on 14; RPS6 on 9pter-p13; RPS11 on 19cen-qter;
and RPS17 on 11pter-p13. These are also different sites from the chromosomal location of previously mapped
ribosomal protein genes S14 on chromosome 5, S4 on Xq and Yp, and RP117A on 9q3-q34. By fluorescence in situ
hybridization the position of the RPL19 gene was mapped to 17q11 [Davies et al., 1989, (7)].

PPARBP, PBP, CRSP1, CRSP200, TRIP2, TRAP220, RB18A, DRIP230 



[0076] The thyroid hormone receptors (TRs) are hormone-dependent transcription factors that regulate expression
of a variety of specific target genes. They must specifically interact with a number of proteins as they progress from
their initial translation and nuclear translocation to heterodimerization with retinoid X receptors (RXRs), functional
interactions with other transcription factors and the basic transcriptional apparatus, and eventually, degradation. To help
elucidate the mechanisms that underlie the transcriptional effects and other potential functions of TRs, the yeast
interaction trap, a version of the yeast 2-hybrid system, was used to identify proteins that specifically interact with the
ligand-binding domain of rat TR-beta-1 (THRB) [Lee et al., 1995, (8)]. The authors isolated HeLa cell cDNAs encoding several
different TR-interacting proteins (TRIPs), including TRIP2. TRIP2 interacted with rat Thrb only in the presence of thyroid
hormone. It showed a ligand-independent interaction with RXR-alpha, but did not interact with the glucocorticoid
receptor (NR3C1) under any condition. By immunoscreening a human B-lymphoma cell cDNA expression library with
the anti-p53 monoclonal antibody PAb1801, PPARBP was identified, which was called RB18A for 'recognized by
PAb1801 monoclonal antibody' [Drane et al., 1997, (9)]. The predicted 1,566-amino acid RB18A protein contains
several potential nuclear localization signals, 13 potential N-glycosylation sites, and a high number of potential
phosphorylation sites. Despite sharing common antigenic determinants with p53, RB18A does not show significant
nucleotide or amino acid sequence similarity with p53. Whereas the calculated molecular mass of RB18A is 166 kD, the
apparent mass of recombinant RB18A was 205 kD by SDS-PAGE analysis. The authors demonstrated that RB18A
shares functional properties with p53, including DNA binding, p53 binding, and self-oligomerization. Furthermore,
RB18A was able to activate the sequence-specific binding of p53 to DNA, which was induced through an unstable
interaction between both proteins. Northern blot analysis of human tissues detected an 8.5-kb RB18A transcript in all
tissues examined except kidney, with highest expression in heart. Moreover, mouse Pparbp, which was called Pbp for
'Ppar-binding protein,' was identified as a protein that interacts with the Ppar-gamma (PPARG) ligand-binding domain
in a yeast 2-hybrid system [Zhu et al., 1997, (10)]. The authors found that Pbp also binds to PPAR-alpha (PPARA),
RAR-alpha (RARA), RXR, and TR-beta-1 in vitro. The binding of Pbp to these receptors increased in the presence of
specific ligands. Deletion of the last 12 amino acids from the C terminus of PPAR-gamma resulted in the abolition of
interaction between Pbp and PPAR-gamma. Pbp modestly increased the transcriptional activity of PPAR-gamma, and
a truncated form of Pbp acted as a dominant-negative repressor, suggesting that Pbp is a genuine transcriptional
co-activator for PPAR. The predicted 1,560-amino acid Pbp protein contains 2 LXXLL motifs, which are considered
necessary and sufficient for the binding of several co-activators to nuclear receptors. Northern blot analysis detected Pbp
expression in all mouse tissues examined, with higher levels in liver, kidney, lung, and testis. In situ hybridization
showed that Pbp is expressed during mouse ontogeny, suggesting a possible role for Pbp in cellular proliferation and
differentiation. In adult mouse, in situ hybridization detected Pbp expression in liver, bronchial epithelium in the lung,
intestinal mucosa, kidney cortex, thymic cortex, splenic follicles, and seminiferous epithelium in testis. Later on, PPARBP
was identified, and called TRAP220, from an immunopurified TR-alpha (THRA)-TRAP complex [Yuan et al.,
1998, (11)]. The authors cloned Jurkat cell cDNAs encoding TRAP220. The predicted 1,581-amino acid TRAP220
protein contains LXXLL domains, which are found in other nuclear receptor-interacting proteins. TRAP220 is nearly
identical to RB18A, with these proteins differing primarily by an extended N terminus on TRAP220. In the absence of
TR-alpha, TRAP220 appears to reside in a single complex with other TRAPs. TRAP220 showed a direct
ligand-dependent interaction with TR-alpha, which was mediated through the C terminus of TR-alpha and, at least in part, the
LXXLL domains of TRAP220. TRAP220 also interacted with other nuclear receptors, including vitamin D receptor,
RARA, RXRA, PPARA, PPARG, and estrogen receptor-alpha (ESR1; 133430), in a ligand-dependent manner.
TRAP220 moderately stimulated human TR-alpha-mediated transcription in transfected cells, whereas a fragment
containing the LXXLL motifs acted as a dominant-negative inhibitor of nuclear receptor-mediated transcription both in
transfected cells and in cell-free transcription systems. Further studies indicated that TRAP220 plays a major role in
anchoring other TRAPs to TR-alpha during the function of the TR-alpha-TRAP complex and that TRAP220 may be a
global co-activator for the nuclear receptor superfamily. PBP, a nuclear receptor co-activator, interacts with estrogen
receptor-alpha (ESR1) in the absence of estrogen. This interaction was enhanced in the presence of estrogen, but 
was reduced in the presence of the anti-estrogen Tamoxifen. Transfection of PBP into cultured cells resulted in en- 
hancement of estrogen-dependent transcription, indicating that PBP serves as a co-activator in estrogen receptor 

10 signaling. To examine whether overexpression of PBP plays a role in breast cancer because of its co-activator function 
in estrogen receptor signaling, the levels of PBP expression in breast tumors was determined [Zhu et al., 1999, (12)]. 
High levels of PBP expression were detected in approximately 50% of primary breast cancers and breast cancer cell 
lines by ribonuclease protection analysis, in situ hybridization, and immunoperoxidase staining. By using FISH, the 
authors mapped the PBP gene to 17q12, a region that is amplified in some breast cancers. They found PBP gene 

15 amplification in approximately 24% (6 of 25) of breast tumors and approximately 30% (2 of 6) of breast cancer cell 
lines, implying that PBP gene overexpression can occur independent of gene amplification. They determined that the 
PBP gene comprises 17exons that together span more than 37 kb. Theirfindings, in particular PBP gene amplification, 
suggested that PBP, by its ability to function as an estrogen receptor-alpha co-activator, may play a role in mammary 
epithelial differentiation and in breast carcinogenesis. 
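
The comparison between amplification and overexpression frequencies quoted above can be retraced with simple arithmetic. The following minimal sketch only recomputes the fractions from the counts cited in the text (6 of 25 tumors, 2 of 6 cell lines) and compares them with the approximate 50% overexpression rate. The variable and function names are illustrative assumptions and are not part of the cited study.

```python
# Counts quoted in the paragraph above (Zhu et al., 1999): (amplified, examined).
amplified = {"breast tumors": (6, 25), "breast cancer cell lines": (2, 6)}


def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# Amplification frequency per sample type.
rates = {name: percent(k, n) for name, (k, n) in amplified.items()}

# Approximate share of samples with high PBP expression, as stated in the text.
overexpressed = 50.0

for name, rate in rates.items():
    gap = round(overexpressed - rate, 1)
    print(f"{name}: amplified in {rate}%, overexpression exceeds this by {gap} points")
```

Because the overexpression frequency is clearly higher than either amplification frequency, a portion of the overexpressing samples cannot be explained by gene amplification alone, which is the inference drawn in the text.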

NEUROD2 

[0077] Basic helix-loop-helix (bHLH) proteins are transcription factors involved in determining cell type during
development. In 1995 a bHLH protein was described, termed NeuroD (for 'neurogenic differentiation'), that functions
during neurogenesis [Lee et al., 1995, (13)]. The human NEUROD gene maps to chromosome 2q32. The cloning and
characterization of 2 additional NEUROD genes, NEUROD2 and NEUROD3, was described in 1996 [McCormick et al.,
1996, (14)]. Sequences for the mouse and human homologues were presented. NEUROD2 shows a high degree of
homology to the bHLH region of NEUROD, whereas NEUROD3 is more distantly related. The authors found that mouse
neuroD2 was initially expressed at embryonic day 11, with persistent expression in the adult nervous system. Similar
to neuroD, neuroD2 appears to mediate neuronal differentiation. The human NEUROD2 was mapped to 17q12 by
fluorescence in situ hybridization and the mouse homologue to chromosome 11 [Tamimi et al., 1997, (15)].

TELETHONIN 

[0078] Telethonin is a sarcomeric protein of 19 kD found exclusively in striated and cardiac muscle. It appears to be
localized to the Z disc of adult skeletal muscle and cultured myocytes. Telethonin is a substrate of titin, which acts as
a molecular 'ruler' for the assembly of the sarcomere by providing spatially defined binding sites for other sarcomeric
proteins. After activation by phosphorylation and calcium/calmodulin binding, titin phosphorylates the C-terminal
domain of telethonin in early differentiating myocytes. The telethonin gene has been mapped to 17q12, adjacent to the
phenylethanolamine N-methyltransferase gene [Valle et al., 1997, (16)].

PENT, PNMT 

[0079] Phenylethanolamine N-methyltransferase catalyzes the synthesis of epinephrine from norepinephrine, the
last step of catecholamine biosynthesis. The cDNA clone was first isolated in 1988 for bovine adrenal medulla PNMT
using mixed oligodeoxyribonucleotide probes whose synthesis was based on the partial amino acid sequence of tryptic
peptides from the bovine enzyme [Kaneda et al., 1988, (17)]. Using a bovine cDNA as a probe, the authors screened
a human pheochromocytoma cDNA library and isolated a cDNA clone with an insert of about 1.0 kb, which contained
a complete coding region of the enzyme. Northern blot analysis of human pheochromocytoma polyadenylated RNA
using this cDNA insert as the probe demonstrated a single RNA species of about 1,000 nucleotides, suggesting that
this clone is a full-length cDNA. The nucleotide sequence showed that human PNMT has 282 amino acid residues
with a predicted molecular weight of 30,853, including the initial methionine. The amino acid sequence was 88%
homologous to that of the bovine enzyme. The PNMT gene was found to consist of 3 exons and 2 introns spanning about
2,100 basepairs. It was demonstrated that in transgenic mice the gene is expressed in adrenal medulla and retina. A
hybrid gene consisting of 2 kb of the PNMT 5-prime-flanking region fused to the simian virus 40 early region also
resulted in tumor antigen mRNA expression in adrenal glands and eyes; furthermore, immunocytochemistry showed
that the tumor antigen was localized in nuclei of adrenal medullary cells and cells of the inner nuclear cell layer of the
retina, both prominent sites of epinephrine synthesis. The results indicate that the enhancer(s) for appropriate
expression of the gene in these cell types are in the 2-kb 5-prime-flanking region of the gene.

[0080] Kaneda et al., 1988 (17), assigned the human PNMT gene to chromosome 17 by Southern blot analysis of
DNA from mouse-human somatic cell hybrids. In 1992 the localization was narrowed down to 17q21-q22 by linkage
analysis using RFLPs related to the PNMT gene and several 17q DNA markers [Hoehe et al., 1992, (18)]. The findings
are of interest in light of the description of a genetic locus associated with blood pressure regulation in the stroke-prone
spontaneously hypertensive rat (SHR-SP) on rat chromosome 10 in a conserved linkage synteny group corresponding
to human chromosome 17q22-q24. See essential hypertension.

MGC9753 

[0081] This gene maps on chromosome 17, at 17q12 according to RefSeq. It is expressed at a very high level. It is
defined by cDNA clones; by alternative splicing, 7 different transcripts can be obtained (SEQ ID NO:60
to 66 and 83 to 89, Table 1), altogether encoding 7 different protein isoforms. Of specific interest is the putatively
secreted isoform g, encoded by an mRNA of 2.55 kb. Its premessenger covers 16.94 kb on the genome. It has a very
long 3' UTR. The protein (226 aa, MW 24.6 kDa, pI 8.5) contains no Pfam motif. The MGC9753 gene produces, by
alternative splicing, 7 types of transcripts, predicted to encode 7 distinct proteins. It contains 13 confirmed introns, 10
of which are alternative. Comparison to the genome sequence shows that 11 introns follow the consensual [gt-ag] rule,
1 is atypical with good support [tg_cg]. The six most abundant isoforms are designated a) to i) and code for proteins
as follows:

a) This mRNA is 3.03 kb long, its premessenger covers 16.95 kb on the genome. It has a very long 3' UTR. The
protein (190 aa, MW 21.5 kDa, pI 7.2) contains no Pfam motif. It is predicted to localise in the endoplasmic reticulum.

c) This mRNA is 1.17 kb long, its premessenger covers 16.93 kb on the genome. It may be incomplete at the N
terminus. The protein (368 aa, MW 41.5 kDa, pI 7.3) contains no Pfam motif.

d) This mRNA is 3.17 kb long, its premessenger covers 16.94 kb on the genome. It has a very long 3' UTR and
5' UTR. The protein (190 aa, MW 21.5 kDa, pI 7.2) contains no Pfam motif. It is predicted to localise in the
endoplasmic reticulum.

g) This mRNA is 2.55 kb long, its premessenger covers 16.94 kb on the genome. It has a very long 3' UTR. The
protein (226 aa, MW 24.6 kDa, pI 8.5) contains no Pfam motif. It is predicted to be secreted.

h) This mRNA is 2.68 kb long, its premessenger covers 16.94 kb on the genome. It has a very long 3' UTR. The
protein (320 aa, MW 36.5 kDa, pI 6.8) contains no Pfam motif. It is predicted to localise in the endoplasmic reticulum.

i) This mRNA is 2.34 kb long, its premessenger covers 16.94 kb on the genome. It may be incomplete at the N
terminus. It has a very long 3' UTR. The protein (217 aa, MW 24.4 kDa, pI 5.9) contains no Pfam motif.
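
The [gt-ag] rule mentioned for the MGC9753 introns refers to the first two and last two bases of an intron (donor and acceptor dinucleotides). The following is a minimal sketch of how an intron sequence could be labelled as canonical or atypical under that rule. The function name and the two toy sequences are hypothetical illustrations and are not taken from the MGC9753 locus or from any annotation tool.

```python
def classify_intron(intron_seq):
    """Label an intron by its boundary dinucleotides (donor-acceptor).

    Returns a tuple (label, is_canonical), where label looks like "gt-ag"
    and is_canonical is True only for the consensus gt-ag pattern.
    """
    seq = intron_seq.lower()
    if len(seq) < 4:
        raise ValueError("intron too short to have both boundary dinucleotides")
    donor, acceptor = seq[:2], seq[-2:]
    label = f"{donor}-{acceptor}"
    return label, label == "gt-ag"


# Hypothetical toy introns for illustration only.
examples = ["GTAAGTCCTTTTCAG", "TGAAGTCCTTTTCCG"]
results = [classify_intron(s) for s in examples]

for seq, (label, canonical) in zip(examples, results):
    kind = "consensus" if canonical else "atypical"
    print(f"{seq}: {label} ({kind})")
```

The first toy sequence is reported as a consensus gt-ag intron and the second as an atypical tg-cg intron, mirroring the two classes named in paragraph [0081].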

[0082] The MGC9753 gene may be a homologue of the CAB2 gene located on chromosome 17q12. CAB2, a
human homologue of the yeast COS16 required for the repair of DNA double-strand breaks, was cloned.
Autofluorescence analysis of cells transfected with its GFP fusion protein demonstrated that CAB2 translocates into vesicles,
suggesting that overexpression of CAB2 may decrease intercellular Mn-(2+) by accumulating it in the vesicles, in the
same way as yeast.

Her-2/neu, ERBB2, NGL, TKR1 

[0083] The oncogene originally called NEU was derived from rat neuro/glioblastoma cell lines. It encodes a tumor
antigen, p185, which is serologically related to EGFR, the epidermal growth factor receptor. EGFR maps to
chromosome 7. In 1985 it was found that the human homologue, designated NGL (to avoid confusion with
neuraminidase, which is also symbolized NEU), maps to 17q12-q22 by in situ hybridization and to 17q21-qter in somatic
cell hybrids [Yang-Feng et al., 1985, (19)]. Thus, the smallest region of overlap (SRO) is 17q21-q22. Moreover, in 1985
a potential cell surface receptor of the tyrosine kinase gene family was identified and characterized by cloning the gene
[Coussens et al., 1985, (20)]. Its primary sequence is very similar to that of the human epidermal growth factor receptor.
Because of the seemingly close relationship to the human EGF receptor, the authors called the gene HER2. By Southern
blot analysis of somatic cell hybrid DNA and by in situ hybridization, the gene was assigned to 17q21-q22. This
chromosomal location of the gene is coincident with the NEU oncogene, which suggests that the 2 genes may in fact be
the same; indeed, sequencing indicates that they are identical. In 1988 a correlation between overexpression of NEU
protein and the large-cell, comedo growth type of ductal carcinoma was found [van de Vijver et al., 1988, (21)]. The
authors found no correlation, however, with lymph-node status or tumor recurrence. The role of HER2/NEU in breast
and ovarian cancer, which together account for one-third of all cancers in women and approximately one-quarter
of cancer-related deaths in females, was described in 1989 [Slamon et al., 1989, (22)].

[0084] An ERBB-related gene, ERBB2, that is distinct from the ERBB gene (called ERBB1) was found in 1985. ERBB2 was
not amplified in vulva carcinoma cells with EGFR amplification and did not react with EGF receptor mRNA. About
30-fold amplification of ERBB2 was observed in a human adenocarcinoma of the salivary gland. By chromosome
sorting combined with velocity sedimentation and Southern hybridization, the ERBB2 gene was assigned to
chromosome 17 [Fukushige et al., 1986, (23)]. By hybridization to sorted chromosomes and to metaphase spreads with a
genomic probe, they mapped the ERBB2 locus to 17q21. This is the chromosome 17 breakpoint in acute promyelocytic
leukemia (APL). Furthermore, they observed amplification and elevated expression of the ERBB2 gene in a gastric
cancer cell line. Antibodies against a synthetic peptide corresponding to 14 amino acid residues at the COOH-terminus
of a protein deduced from the ERBB2 nucleotide sequence were raised in 1986. With these antibodies, the ERBB2
gene product from adenocarcinoma cells was precipitated and demonstrated to be a 185-kD glycoprotein with tyrosine
kinase activity. Using a cDNA probe for ERBB2, in situ hybridization to APL cells with a 15;17 chromosome translocation
located the gene to the proximal side of the breakpoint [Kaneko et al., 1987, (24)]. The authors suggested that both
the gene and the breakpoint are located in band 17q21.1 and, further, that the ERBB2 gene is involved in the
development of leukemia. In 1987 experiments indicated that NEU and HER2 are both the same as ERBB2 [Di Fiore et al.,
1987, (25)]. The authors demonstrated that overexpression alone can convert the gene for a normal growth factor
receptor, namely, ERBB2, into an oncogene. The ERBB2 gene was mapped to 17q11-q21 by in situ hybridization
[Popescu et al., 1989, (26)]. By in situ hybridization to chromosomes derived from fibroblasts carrying a constitutional
translocation between 15 and 17, they showed that the ERBB2 gene was relocated to the derivative chromosome 15;
the gene can thus be localized to 17q12-q21.32. By family linkage studies using multiple DNA markers in the 17q12-q21
region the ERBB2 gene was placed on the genetic map of the region.

[0085] Interleukin-6 is a cytokine that was initially recognized as a regulator of immune and inflammatory responses,
but also regulates the growth of many tumor cells, including prostate cancer. Overexpression of ERBB2 and ERBB3
has been implicated in the neoplastic transformation of prostate cancer. Treatment of a prostate cancer cell line with
IL6 induced tyrosine phosphorylation of ERBB2 and ERBB3, but not ERBB1/EGFR. ERBB2 forms a complex with
the gp130 subunit of the IL6 receptor in an IL6-dependent manner. This association was important because the
inhibition of ERBB2 activity resulted in abrogation of IL6-induced MAPK activation. Thus, ERBB2 is a critical component
of IL6 signaling through the MAP kinase pathway [Qiu et al., 1998, (27)]. These findings showed how a cytokine receptor
can diversify its signaling pathways by engaging with a growth factor receptor kinase.

[0086] Overexpression of ERBB2 confers Taxol resistance in breast cancers. Overexpression of ERBB2 inhibits
Taxol-induced apoptosis [Yu et al., 1998, (28)]. Taxol activates CDC2 kinase in MDA-MB-435 breast cancer cells,
leading to cell cycle arrest at the G2/M phase and, subsequently, apoptosis. A chemical inhibitor of CDC2 and a
dominant-negative mutant of CDC2 blocked Taxol-induced apoptosis in these cells. Overexpression of ERBB2 in
MDA-MB-435 cells by transfection transcriptionally upregulates CDKN1A, which associates with CDC2, inhibits
Taxol-mediated CDC2 activation, delays cell entrance to G2/M phase, and thereby inhibits Taxol-induced apoptosis. In
CDKN1A antisense-transfected MDA-MB-435 cells or in p21-/- MEF cells, ERBB2 was unable to inhibit Taxol-induced
apoptosis. Therefore, CDKN1A participates in the regulation of a G2/M checkpoint that contributes to resistance to
Taxol-induced apoptosis in ERBB2-overexpressing breast cancer cells.

[0087] A secreted protein of approximately 68 kD was described, designated herstatin, as the product of an alternative
ERBB2 transcript that retains intron 8 [Doherty et al., 1999, (29)]. This alternative transcript specifies 340 residues
identical to subdomains I and II from the extracellular domain of p185ERBB2, followed by a unique C-terminal sequence
of 79 amino acids encoded by intron 8. The recombinant product of the alternative transcript specifically bound to
ERBB2-transfected cells and was chemically crosslinked to p185ERBB2, whereas the intron-encoded sequence alone
also bound with high affinity to transfected cells and associated with p185 solubilized from cell extracts. The herstatin
mRNA was expressed in normal human fetal kidney and liver, but was at reduced levels relative to p185ERBB2 mRNA
in carcinoma cells that contained an amplified ERBB2 gene. Herstatin appears to be an inhibitor of p185ERBB2,
because it disrupts dimers, reduces tyrosine phosphorylation of p185, and inhibits the anchorage-independent growth of
transformed cells that overexpress ERBB2. The HER2 gene is amplified and HER2 is overexpressed in 25 to 30% of
breast cancers, increasing the aggressiveness of the tumor. Finally, it was found that a recombinant monoclonal
antibody against HER2 increased the clinical benefit of first-line chemotherapy in metastatic breast cancer that
overexpresses HER2 [Slamon et al., 2001, (30)].

GRB7 

[0088] Growth factor receptor tyrosine kinases (GF-RTKs) are involved in activating the cell cycle. Several substrates
of GF-RTKs contain Src-homology 2 (SH2) and SH3 domains. SH2 domain-containing proteins are a diverse group
of molecules important in tyrosine kinase signaling. Using the CORT (cloning of receptor targets) method to screen a
high expression mouse library, the gene for murine Grb7, which encodes a protein of 535 amino acids, was isolated
[Margolis et al., 1992, (31)]. GRB7 is homologous to ras-GAP (ras-GTPase-activating protein). It contains an SH2
domain and is highly expressed in liver and kidney. This gene defines the GRB7 family, whose members include the
mouse gene Grb10 and the human gene GRB14.

[0089] A putative GRB7 signal transduction molecule and a novel GRB7V splice variant were isolated from an invasive
human esophageal carcinoma [Tanaka et al., 1998, (32)]. Although both GRB7 isoforms shared homology with
the Mig-10 cell migration gene of Caenorhabditis elegans, the GRB7V isoform lacked 88 basepairs in the C terminus;
the resultant frameshift led to substitution of an SH2 domain with a short hydrophobic sequence. The wildtype GRB7
protein, but not the GRB7V isoform, was rapidly tyrosyl phosphorylated in response to EGF stimulation in esophageal
carcinoma cells. Analysis of human esophageal tumor tissues and regional lymph nodes with metastases revealed
that GRB7V was expressed in 40% of GRB7-positive esophageal carcinomas. GRB7V expression was enhanced after
metastatic spread to lymph nodes as compared to the original tumor tissues. Transfection of an antisense GRB7 RNA
expression construct lowered endogenous GRB7 protein levels and suppressed the invasive phenotype exhibited by
esophageal carcinoma cells. These findings suggested that GRB7 isoforms are involved in cell invasion and metastatic
progression of human esophageal carcinomas. By sequence analysis, the GRB7 gene was mapped to chromosome
17q21-q22, near the topoisomerase-2 gene [Dong et al., 1997, (33)]. GRB-7 is amplified in concert with HER2 in several
breast cancer cell lines and is overexpressed in both cell lines and breast tumors. GRB-7, through its SH2
domain, binds tightly to HER2 such that a large fraction of the tyrosine phosphorylated HER2 in SKBR-3 cells is bound
to GRB-7 [Stein et al., 1994, (34)].

GCSF, CSF3 

[0090] Granulocyte colony-stimulating factor (or colony stimulating factor-3) specifically stimulates the proliferation
and differentiation of the progenitor cells for granulocytes. The partial amino acid sequence of purified GCSF protein
was determined, and by using oligonucleotides as probes, several GCSF cDNA clones were isolated from a human
squamous carcinoma cell line cDNA library [Nagata et al., 1986, (35)]. Cloning of human GCSF cDNA shows that a
single gene codes for a 177- or 180-amino acid mature protein of molecular weight 19,600. The authors found that the
GCSF gene has 4 introns and that 2 different polypeptides are synthesized from the same gene by differential splicing
of mRNA. The 2 polypeptides differ by the presence or absence of 3 amino acids. Expression studies indicate that
both have authentic GCSF activity. A stimulatory activity from a glioblastoma multiforme cell line, biologically and
biochemically indistinguishable from GCSF produced by a bladder cell line, was found in 1987. By somatic cell
hybridization and in situ chromosomal hybridization, the GCSF gene was mapped to 17q11 in the region of the breakpoint
in the 15;17 translocation characteristic of acute promyelocytic leukemia [Le Beau et al., 1987, (36)]. Further studies
indicated that the gene is proximal to the said breakpoint and that it remains on the rearranged chromosome 17.
Southern blot analysis using both conventional and pulsed field gel electrophoresis showed no rearranged restriction
fragments. By use of a full-length cDNA clone as a hybridization probe in human-mouse somatic cell hybrids and in
flow-sorted human chromosomes, the gene for GCSF was later mapped to 17q21-q22.

THRA, THRA1, ERBA, EAR7, ERBA2, ERBA3 

[0091] Both human and mouse DNA have been demonstrated to have two distantly related classes of ERBA genes,
and in the human genome multiple copies of one of the classes exist [Jansson et al., 1983, (37)]. A cDNA derived
from rat brain messenger RNA was isolated on the basis of homology to the human thyroid receptor gene [Thompson
et al., 1987, (38)]. Expression of this cDNA produced a high-affinity binding protein for thyroid hormones. Messenger
RNA from this gene was expressed in tissue-specific fashion, with highest levels in the central nervous system and no
expression in the liver. An increasing body of evidence indicated the presence of multiple thyroid hormone receptors.
The authors suggested that there may be as many as 5 different but related loci. Many of the clinical and physiologic
studies suggested the existence of multiple receptors. For example, patients had been identified with familial thyroid
hormone resistance in which peripheral response to thyroid hormones is lost or diminished while neuronal functions
are maintained. Thyroidologists recognize a form of cretinism in which the nervous system is severely affected and
another form in which the peripheral functions of thyroid hormone are more dramatically affected.

[0092] The cDNA encoding a specific form of thyroid hormone receptor expressed in human liver, kidney, placenta,
and brain was isolated [Nakai et al., 1988, (39)]. Identical clones were found in human placenta. The cDNA encodes
a protein of 490 amino acids and molecular mass of 54,824. Designated thyroid hormone receptor type alpha-2
(THRA2), this protein is represented by mRNAs of different size in liver and kidney, which may represent tissue-specific
processing of the primary transcript.

[0093] The THRA gene contains 10 exons spanning 27 kb of DNA. The last 2 exons of the gene are alternatively
spliced. A 5-kb THRA1 mRNA encodes a predicted 410-amino acid protein; a 2.7-kb THRA2 mRNA encodes a
490-amino acid protein. A third isoform, TR-alpha-3, is derived by alternative splicing. The proximal 39 amino acids of the
TR-alpha-2 specific sequences are deleted in TR-alpha-3. A second gene, THRB on chromosome 3, encodes 2 isoforms
of TR-beta by alternative splicing. In 1989 the structure and function of the EAR1 and EAR7 genes, both located on
17q21, were elucidated [Miyajima et al., 1989, (40)]. The authors determined that one of the exons in the EAR7 coding
sequence overlaps an exon of EAR1, and that the 2 genes are transcribed from opposite DNA strands. In addition,
the EAR7 mRNA generates 2 alternatively spliced isoforms, referred to as EAR71 and EAR72, of which the EAR71
protein is the human counterpart of the chicken c-erbA protein.

[0094] The 3 mRNAs of the thyroid hormone receptors beta, alpha-1, and alpha-2 are expressed in all tissues examined
and the relative amounts of the three mRNAs were roughly parallel. None of the 3 mRNAs was abundant in liver, which
is the major thyroid hormone-responsive organ. This led to the assumption that another thyroid hormone receptor may
be present in liver. It was found that ERBA, which potentiates ERBB, has an amino acid sequence different from that
of other known oncogene products and related to those of the carbonic anhydrases [Debuire et al., 1984, (41)]. ERBA
potentiates ERBB by blocking differentiation of erythroblasts at an immature stage. Carbonic anhydrases participate
in the transport of carbon dioxide in erythrocytes. In 1986 it was shown that the ERBA protein is a high-affinity receptor
for thyroid hormone. The cDNA sequence indicates a relationship to steroid-hormone receptors, and binding studies
indicate that it is a receptor for thyroid hormones. It is located in the nucleus, where it binds to DNA and activates
transcription.

[0095] Maternal thyroid hormone is transferred to the fetus early in pregnancy and is postulated to regulate brain
development. The ontogeny of TR isoforms and related splice variants in 9 first-trimester fetal brains has been
investigated by semi-quantitative RT-PCR analysis. Expression of the TR-beta-1, TR-alpha-1, and TR-alpha-2 isoforms
was detected from 8.1 weeks' gestation. An additional truncated species was detected with the TR-alpha-2 primer set,
consistent with the TR-alpha-3 splice variant described in the rat. All TR-alpha-derived transcripts were coordinately
expressed and increased approximately 8-fold between 8.1 and 13.9 weeks' gestation. A more complex ontogenic
pattern was observed for TR-beta-1, suggestive of a nadir between 8.4 and 12.0 weeks' gestation. The authors
concluded that these findings point to an important role for the TR-alpha-1 isoform in mediating maternal thyroid hormone
action during first-trimester fetal brain development.

[0096] The identification of the several types of thyroid hormone receptor may explain the normal variation in thyroid
hormone responsiveness of various organs and the selective tissue abnormalities found in the thyroid hormone
resistance syndromes. Members of sibships, who were resistant to thyroid hormone action, had retarded growth, congenital
deafness, and abnormal bones, but had normal intellect and sexual maturation, as well as augmented cardiovascular
activity. In this family abnormal T3 nuclear receptors in blood cells and fibroblasts have been demonstrated. The
availability of cDNAs encoding the various thyroid hormone receptors was considered useful in determining the underlying
genetic defect in this family.

[0097] The ERBA oncogene has been assigned to chromosome 17. The ERBA locus remains on chromosome 17
in the t(15;17) translocation of acute promyelocytic leukemia (APL). The thymidine kinase locus is probably translocated
to chromosome 15; study of leukemia with t(17;21) and apparently identical breakpoint showed that TK was on 21q+.
By in situ hybridization of a cloned DNA probe of c-erb-A to meiotic pachytene spreads obtained from uncultured
spermatocytes it has been concluded that ERBA is situated at 17q21.33-17q22, in the same region as the break that
generated the t(15;17) seen in APL. Because most of the grains were seen in 17q22, it was suggested that ERBA is
probably in the proximal region of 17q22 or at the junction between 17q22 and 17q21.33. By in situ hybridization it has
been demonstrated that ERBA remains at 17q11-q12 in APL, whereas TP53, at 17q21-q22, is translocated to
chromosome 15. Thus, ERBA must be at 17q11.2 just proximal to the breakpoint in the APL translocation and just
distal to it in the constitutional translocation.

[0098] The aberrant THRA expression in nonfunctioning pituitary tumors has been hypothesized to reflect mutations
in the receptor coding and regulatory sequences. THRA mRNA and THRB response elements and
ligand-binding domains were screened for sequence anomalies. Screening THRA mRNA from 23 tumors by RNAse
mismatch and sequencing candidate fragments identified 1 silent and 3 missense mutations, 2 in the common THRA
region and 1 that was specific for the alpha-2 isoform. No THRB response element differences were detected in 14
nonfunctioning tumors, and no THRB ligand-binding domain differences were detected in 23 nonfunctioning tumors.
Therefore it has been suggested that the novel thyroid receptor mutations may be of functional significance in terms of
thyroid receptor action, and further definition of their functional properties may provide insight into the role of thyroid
receptors in growth control in pituitary cells.

RAR-alpha 

[0099] A cDNA encoding a protein that binds retinoic acid with high affinity has been cloned [Petkovich et al., 1987,
(42)]. The protein was found to be homologous to the receptors for steroid hormones, thyroid hormones, and vitamin
D3, and appeared to be a retinoic acid-inducible transacting enhancer factor. Thus, the molecular mechanisms of the
effect of vitamin A on embryonic development, differentiation and tumor cell growth may be similar to those described
for other members of this nuclear receptor family. In general, the DNA-binding domain is most highly conserved, both
within and between the 2 groups of receptors (steroid and thyroid). Using a cDNA probe, the RAR-alpha gene has
been mapped to 17q21 by in situ hybridization [Mattei et al., 1988, (43)]. Evidence has been presented for the existence
of 2 retinoic acid receptors, RAR-alpha and RAR-beta, mapping to chromosome 17q21.1 and 3p24, respectively. The
alpha and beta forms of RAR were found to be more homologous to the 2 closely related thyroid hormone receptors
alpha and beta, located on 17q11.2 and 3p25-p21, respectively, than to any other members of the nuclear receptor
family. These observations suggest that the thyroid hormone and retinoic acid receptors evolved by gene, and possibly
chromosome, duplications from a common ancestor, which itself diverged rather early in evolution from the common
ancestor of the steroid receptor group of the family. It was noted that the counterparts of the human RARA and RARB
genes are present in both the mouse and chicken. The involvement of RARA at the APL breakpoint may explain why
the use of retinoic acid as a therapeutic differentiation agent in the treatment of acute myeloid leukemias is limited to
APL. Almost all patients with APL have a chromosomal translocation t(15;17)(q22;q21). Molecular studies reveal that
the translocation results in a chimeric gene through fusion between the PML gene on chromosome 15 and the RARA
gene on chromosome 17. A hormone-dependent interaction of the nuclear receptors RARA and RXRA with CLOCK
and MOP4 has been presented.

CDC18L, CDC6

[0100] In yeasts, Cdc6 (Saccharomyces cerevisiae) and Cdc18 (Schizosaccharomyces pombe) associate with the
origin recognition complex (ORC) proteins to render cells competent for DNA replication. Thus, Cdc6 has a critical
regulatory role in the initiation of DNA replication in yeast. cDNAs encoding Xenopus and human homologues of yeast
CDC6 have been isolated [Williams et al., 1997, (44)]. They designated the human and Xenopus proteins p62(cdc6).
Independently, in a yeast 2-hybrid assay using PCNA as bait, cDNAs encoding the human CDC6/Cdc18 homologue
have been isolated [Saha et al., 1998, (45)]. These authors reported that the predicted 560-amino acid human protein
shares approximately 33% sequence identity with the 2 yeast proteins. On Western blots of HeLa cell extracts, human
CDC6/Cdc18 migrates as a 66-kD protein. Although Northern blots indicated that CDC6/Cdc18 mRNA levels peak at
the onset of S phase and diminish at the onset of mitosis in HeLa cells, the authors found that the total CDC6/Cdc18 protein
level is unchanged throughout the cell cycle. Immunofluorescent analysis of epitope-tagged protein revealed that human
CDC6/Cdc18 is nuclear in G1 and cytoplasmic in S-phase cells, suggesting that DNA replication may be regulated
by either the translocation of this protein between the nucleus and cytoplasm or by selective degradation of the protein
in the nucleus. Immunoprecipitation studies showed that human CDC6/Cdc18 associates in vivo with cyclin A,
CDK2, and ORC1. The association of cyclin-CDK2 with CDC6/Cdc18 was specifically inhibited by a factor present in
mitotic cell extracts. Therefore it has been suggested that if the interaction of CDC6/Cdc18 with the S phase-promoting
factor cyclin-CDK2 is essential for the initiation of DNA replication, the mitotic inhibitor of this interaction
could prevent a premature interaction until the appropriate time in G1. Cdc6 is expressed selectively in proliferating but
not quiescent mammalian cells, both in culture and within tissues in intact animals [Yan et al., 1998, (46)]. During the
transition from a growth-arrested to a proliferative state, transcription of mammalian Cdc6 is regulated by E2F proteins,
as revealed by a functional analysis of the human Cdc6 promoter and by the ability of exogenously expressed E2F
proteins to stimulate the endogenous Cdc6 gene. Immunodepletion of Cdc6 by microinjection of anti-Cdc6 antibody
blocked initiation of DNA replication in a human tumor cell line. The authors concluded that expression of human Cdc6
is regulated in response to mitogenic signals through transcriptional control mechanisms involving E2F proteins, and
that Cdc6 is required for initiation of DNA replication in mammalian cells.

[0101] Using a yeast 2-hybrid system, co-purification of recombinant proteins, and immunoprecipitation, it has
later been demonstrated that an N-terminal segment of CDC6 binds specifically to PR48, a regulatory subunit of protein
phosphatase 2A (PP2A). The authors hypothesized that dephosphorylation of CDC6 by PP2A, mediated by a specific
interaction with PR48 or a related B-double prime protein, is a regulatory event controlling initiation of DNA replication
in mammalian cells. By analysis of somatic cell hybrids and by fluorescence in situ hybridization, the human p62(cdc6)
gene has been mapped to 17q21.3.

TOP2A, TOP2 

[0102] DNA topoisomerases are enzymes that control and alter the topologic states of DNA in both prokaryotes and
eukaryotes. Topoisomerase II from eukaryotic cells catalyzes the relaxation of supercoiled DNA molecules, catenation,
decatenation, knotting, and unknotting of circular DNA. It appears likely that the reaction catalyzed by topoisomerase
II involves the crossing-over of 2 DNA segments. It has been estimated that there are about 100,000 molecules of
topoisomerase II per HeLa cell nucleus, constituting about 0.1% of the nuclear extract. Since several of the abnormal
characteristics of ataxia-telangiectasia appear to be due to defects in DNA processing, screening for these enzyme
activities in 5 AT cell lines has been performed [Singh et al., 1988, (47)]. In comparison to controls, the level of DNA
topoisomerase II, determined by unknotting of P4 phage DNA, was reduced substantially in 4 of these cell lines and
to a lesser extent in the fifth. DNA topoisomerase I, assayed by relaxation of supercoiled DNA, was found to be present
at normal levels.

[0103] The entire coding sequence of the human TOP2 gene has been determined [Tsai-Pflugfelder et al., 1988, (48)].
[0104] In addition, human cDNAs isolated by screening a cDNA library derived from a mechlorethamine-resistant
Burkitt lymphoma cell line (Raji-HN2) with a Drosophila Topo II cDNA have been sequenced [Chung et al.,
1989, (49)]. The authors identified 2 classes of sequence representing 2 TOP2 isoenzymes, which have been named
TOP2A and TOP2B. The sequence of 1 of the TOP2A cDNAs is identical to that of an internal fragment of the TOP2
cDNA isolated by Tsai-Pflugfelder et al., 1988 (48). Southern blot analysis indicated that the TOP2A and TOP2B cDNAs
are derived from distinct genes. Northern blot analysis using a TOP2A-specific probe detected a 6.5-kb transcript in
the human cell line U937. Antibodies against a TOP2A peptide recognized a 170-kD protein in U937 cell lysates.
Therefore it was concluded that these data provide genetic and immunochemical evidence for 2 TOP2 isozymes. The
complete structures of the TOP2A and TOP2B genes have been reported [Lang et al., 1998, (50)]. The TOP2A gene
spans approximately 30 kb and contains 35 exons.

[0105] Tsai-Pflugfelder et al., 1988 (48) showed that the human enzyme is encoded by a single-copy gene which
they mapped to 17q21-q22 by a combination of in situ hybridization of a cloned fragment to metaphase chromosomes
and Southern hybridization analysis with a panel of mouse-human hybrid cell lines. The assignment to chromosome
17 has been confirmed by the study of somatic cell hybrids. Because of co-amplification in an adenocarcinoma cell
line, it was concluded that the TOP2A and ERBB2 genes may be closely linked on chromosome 17 [Keith et al., 1992,
(51)]. Using probes that detected RFLPs at both the TOP2A and TOP2B loci, the authors demonstrated heterozygosity at a
frequency of 0.17 and 0.37 for the alpha and beta loci, respectively. The mouse homologue was mapped to chromosome
11 [Kingsmore et al., 1993, (52)]. The structure and function of type II DNA topoisomerases have been reviewed [Watt
et al., 1994, (53)]. DNA topoisomerase II-alpha is associated with the pol II holoenzyme and is a required component
of chromatin-dependent co-activation. Specific inhibitors of topoisomerase II blocked transcription on chromatin
templates, but did not affect transcription on naked templates. Addition of purified topoisomerase II-alpha reconstituted
chromatin-dependent activation activity in reactions with core pol II. Therefore transcription on chromatin templates
seems to result in the accumulation of superhelical tension, making the relaxation activity of topoisomerase II essential
for productive RNA synthesis on nucleosomal DNA.

IGFBP4 

[0106] Six structurally distinct insulin-like growth factor binding proteins have been isolated and their cDNAs cloned:
IGFBP1, IGFBP2, IGFBP3, IGFBP4, IGFBP5 and IGFBP6. The proteins display strong sequence homologies,
suggesting that they are encoded by a closely related family of genes. The IGFBPs contain 3 structurally distinct domains
each comprising approximately one-third of the molecule. The N-terminal domain 1 and the C-terminal domain 3 of
the 6 human IGFBPs show moderate to high levels of sequence identity including 12 and 6 invariant cysteine residues
in domains 1 and 3, respectively (IGFBP6 contains 10 cysteine residues in domain 1), and are thought to be the IGF
binding domains. Domain 2 is defined primarily by a lack of sequence identity among the 6 IGFBPs and by a lack of
cysteine residues, though it does contain 2 cysteines in IGFBP4. Domain 3 is homologous to the thyroglobulin type I
repeat unit. Recombinant human insulin-like growth factor binding proteins 4, 5, and 6 have been characterized by
their expression in yeast as fusion proteins with ubiquitin [Kiefer et al., 1992, (54)]. Results of the study suggested to
the authors that the primary effect of the 3 proteins is the attenuation of IGF activity and suggested that they contribute
to the control of IGF-mediated cell growth and metabolism.

[0107] Based on peptide sequences of a purified insulin-like growth factor-binding protein (IGFBP), rat IGFBP4 has
been cloned by using PCR [Shimasaki et al., 1990, (55)]. They used the rat cDNA to clone the human ortholog from
a liver cDNA library. Human IGFBP4 encodes a 258-amino acid polypeptide, which includes a 21-amino acid signal
sequence. The protein is very hydrophilic, which may facilitate its ability as a carrier protein for the IGFs in blood.
Northern blot analysis of rat tissues revealed expression in all tissues examined, with highest expression in liver. It
was stated that IGFBP4 acts as an inhibitor of IGF-induced bone cell proliferation. The genomic region containing the
IGFBP4 gene has been examined [Zazzi et al., 1998, (56)]. The gene consists of 4 exons spanning approximately 15 kb
of genomic DNA. The upstream region of the gene contains a TATA box and a cAMP-responsive promoter.

[0108] By in situ hybridization, the IGFBP4 gene was mapped to 17q12-q21 [Bajalica et al., 1992, (57)]. Because
the hereditary breast-ovarian cancer gene BRCA1 had been mapped to the same region, it has been investigated
whether IGFBP4 is a candidate gene by linkage analysis of 22 BRCA1 families; the finding of genetic recombination
suggested that it is not the BRCA1 gene [Tonin et al., 1993, (58)].
EBI1, CCR7, CMKBR7

[0109] Using PCR with degenerate oligonucleotides, a lymphoid-specific member of the G protein-coupled receptor
family has been identified and mapped to 17q12-q21.2 by analysis of human/mouse somatic cell hybrid DNAs
and fluorescence in situ hybridization. It has been shown that this receptor had been independently identified as the
Epstein-Barr-induced cDNA (symbol EBI1) [Birkenbach et al., 1993, (59)]. EBI1 is expressed in normal lymphoid tissues
and in several B- and T-lymphocyte cell lines. While the function and the ligand for EBI1 remain unknown, its sequence
and gene structure suggest that it is related to receptors that recognize chemoattractants, such as interleukin-8,
RANTES, C5a, and fMet-Leu-Phe. Like the chemoattractant receptors, EBI1 contains intervening sequences near its
5-prime end; however, EBI1 is unique in that both of its introns interrupt the coding region of the first extracellular
domain. Mouse Ebi1 cDNA has been isolated and found to encode a protein with 86% identity to the human homologue.
[0110] Subsets of murine CD4+ T cells localize to different areas of the spleen after adoptive transfer. Naive and T
helper-1 (TH1) cells, which express CCR7, home to the periarteriolar lymphoid sheath, whereas activated TH2 cells,
which lack CCR7, form rings at the periphery of the T-cell zones near B-cell follicles. It has been found that retroviral
transduction of TH2 cells with CCR7 forced them to localize in a TH1-like pattern and inhibited their participation in
B-cell help in vivo but not in vitro. Apparently differential expression of chemokine receptors results in unique cellular
migration patterns that are important for effective immune responses.

[0111] CCR7 expression divides human memory T cells into 2 functionally distinct subsets. CCR7- memory cells
express receptors for migration to inflamed tissues and display immediate effector function. In contrast, CCR7+ memory
cells express lymph node homing receptors and lack immediate effector function, but efficiently stimulate dendritic cells
and differentiate into CCR7- effector cells upon secondary stimulation. The CCR7+ and CCR7- T cells, named central
memory (T-CM) and effector memory (T-EM), differentiate in a step-wise fashion from naive T cells, persist for years
after immunization, and allow a division of labor in the memory response.

[0112] CCR7 expression in memory CD8+ T lymphocyte responses to HIV and to cytomegalovirus (CMV) tetramers
has been evaluated. Most memory T lymphocytes express CD45RO, but a fraction express instead the CD45RA marker.
Flow cytometric analyses of marker expression and cell division identified 4 subsets of HIV- and CMV-specific CD8+
T cells, representing a lineage differentiation pattern: CD45RA+CCR7+ (double-positive); CD45RA-CCR7+;
CD45RA-CCR7- (double-negative); CD45RA+CCR7-. The capacity for cell division, as measured by 5-(and 6-)carboxyl-fluorescein
diacetate, succinimidyl ester, and intracellular staining for the Ki67 nuclear antigen, is largely confined to
the CCR7+ subsets and occurs more rapidly in cells that are also CD45RA+. Although the double-negative cells did
not divide or expand after stimulation, they did revert to positivity for either CD45RA or CCR7 or both. The
CD45RA+CCR7- cells, considered to be terminally differentiated, fail to divide, but do produce interferon-gamma and
express high levels of perforin. The representation of subsets specific for CMV and for HIV is distinct. Approximately
70% of HIV-specific CD8+ memory T cells are double-negative or preterminally differentiated compared to 40% of
CMV-specific cells. Approximately 50% of the CMV-specific CD8+ memory T cells are terminally differentiated
compared to fewer than 10% of the HIV-specific cells. It has been proposed that terminally differentiated CMV-specific cells
are poised to rapidly intervene, while double-positive precursor cells remain for expansion and replenishment of the
effector cell pool. Furthermore, high-dose antigen tolerance and the depletion of HIV-specific CD4+ helper T-cell activity
may keep the HIV-specific memory CD8+ T cells at the double-negative stage, unable to differentiate to the terminal
effector state. B lymphocytes recirculate between B cell-rich compartments (follicles or B zones) in secondary lymphoid
organs, surveying for antigen. After antigen binding, B cells move to the boundary of B and T zones to interact with
T-helper cells. Furthermore it has been demonstrated that antigen-engaged B cells have increased expression of CCR7,
the receptor for the T-zone chemokines CCL19 (also known as ELC) and CCL21, and that they exhibit increased
responsiveness to both chemoattractants. In mice lacking lymphoid CCL19 and CCL21 chemokines, or with B cells
that lack CCR7, antigen engagement fails to cause movement to the T zone. Using retroviral-mediated gene transfer,
the authors demonstrated that increased expression of CCR7 is sufficient to direct B cells to the T zone. Reciprocally,
overexpression of CXCR5, the receptor for the B-zone chemokine CXCL13, is sufficient to overcome antigen-induced
B-cell movement to the T zone. This points toward a mechanism of B-cell relocalization in response to antigen, and
establishes that cell position in vivo can be determined by the balance of responsiveness to chemoattractants made
in separate but adjacent zones.
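
The four CD8+ memory T-cell subsets described above are defined by two surface markers, so the assignment can be sketched as a simple lookup. The following is only an illustrative sketch under stated assumptions: the function names and the boolean encoding of marker positivity are hypothetical and not part of the cited studies, and the division rule merely restates that the capacity for cell division is largely confined to the CCR7+ subsets.

```python
# Illustrative lookup of the four subsets named in the text.
# Keys are (CD45RA positive?, CCR7 positive?); the boolean encoding is an assumption.
SUBSETS = {
    (True, True): "CD45RA+CCR7+ (double-positive)",
    (False, True): "CD45RA-CCR7+",
    (False, False): "CD45RA-CCR7- (double-negative)",
    (True, False): "CD45RA+CCR7- (terminally differentiated)",
}


def cd8_subset(cd45ra_positive: bool, ccr7_positive: bool) -> str:
    """Return the subset label for a given marker combination."""
    return SUBSETS[(cd45ra_positive, ccr7_positive)]


def likely_divides(ccr7_positive: bool) -> bool:
    """Per the text, division capacity is largely confined to CCR7+ subsets."""
    return ccr7_positive


if __name__ == "__main__":
    for cd45ra in (True, False):
        for ccr7 in (True, False):
            print(cd8_subset(cd45ra, ccr7), "| divides:", likely_divides(ccr7))
```

In practice marker positivity is a gated measurement rather than a fixed boolean, so any threshold used to derive these flags would have to be chosen per experiment.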

BAF57, SMARCE1

[0113] The SWI/SNF complex in S. cerevisiae and Drosophila is thought to facilitate transcriptional activation of
specific genes by antagonizing chromatin-mediated transcriptional repression. The complex contains an ATP-dependent
nucleosome disruption activity that can lead to enhanced binding of transcription factors. The BRG1/brm-associated
factors, or BAF, complex in mammals is functionally related to SWI/SNF and consists of 9 to 12 subunits, some of
which are homologous to SWI/SNF subunits. A 57-kD BAF subunit, BAF57, is present in higher eukaryotes, but not in
yeast. Partial coding sequence has been obtained from purified BAF57 from extracts of a human cell line [Wang et al.,
1998, (60)]. Based on the peptide sequences, they identified cDNAs encoding BAF57. The predicted 411-amino acid
protein contains an HMG domain adjacent to a kinesin-like region. Both recombinant BAF57 and the whole BAF complex
bind 4-way junction (4WJ) DNA, which is thought to mimic the topology of DNA as it enters or exits the nucleosome.
The BAF57 DNA-binding activity has characteristics similar to those of other HMG proteins. It was found that complexes
with mutations in the BAF57 HMG domain retain their DNA-binding and nucleosome-disruption activities. They suggested
that the mechanism by which mammalian SWI/SNF-like complexes interact with chromatin may involve recognition
of higher-order chromatin structure by 2 or more DNA-binding domains. RNase protection studies and Western
blot analysis revealed that BAF57 is expressed ubiquitously. Several lines of evidence point toward the involvement
of SWI/SNF factors in cancer development [Klochendler-Yeivin et al., 2002, (61)]. Moreover, SWI/SNF related genes
are assigned to chromosomal regions that are frequently involved in somatic rearrangements in human cancers [Ring
et al., 1998, (62)]. In this respect it is interesting that some of the SWI/SNF family members (i.e. SMARCC1, SMARCC2,
SMARCD1 and SMARCD2) are neighboring 3 of the eukaryotic ARCHEONs we have identified (i.e. 3p21-p24,
12q13-q14 and 17q, respectively) and which are part of the present invention. In this invention we could also map
SMARCE1/BAF57 to the 17q12 region by PCR karyotyping.

KRT10, K10

[0114] Keratin 10 is an intermediate filament (IF) chain which belongs to the acidic type I family and is expressed in
terminally differentiated epidermal cells. Epithelial cells almost always co-express pairs of type I and type II keratins,
and the pairs that are co-expressed are highly characteristic of a given epithelial tissue. For example, in human epidermis,
3 different pairs of keratins are expressed: keratins 5 (type II) and 14 (type I), characteristic of basal or proliferative
cells; keratins 1 (type II) and 10 (type I), characteristic of suprabasal terminally differentiating cells; and keratins
6 (type II) and 16 (type I) (and keratin 17 [type I]), characteristic of cells induced to hyper-proliferate by disease or
injury, and epithelial cells grown in cell culture. The nucleotide sequence of a 1,700 bp cDNA encoding human epidermal
keratin 10 (56.5 kD) [Darmon et al., 1987, (63)] has been published as well as the complete amino acid sequence of
human keratin 10 [Zhou et al., 1988, (64)]. Polymorphisms of the KRT10 gene, restricted to insertions and deletions of
the glycine-rich quasipeptide repeats that form the glycine-loop motif in the C-terminal domain, have been extensively
described [Korge et al., 1992, (65)].

[0115] By use of specific cDNA clones in conjunction with somatic cell hybrid analysis and in situ hybridization, the KRT10
gene has been mapped to 17q12-q21 in a region proximal to the breakpoint at 17q21 that is involved in a t(17;21)(q21;q22)
translocation associated with a form of acute leukemia. KRT10 appeared to be telomeric to 3 other loci that map
in the same region: CSF3, ERBA1, and HER2 [Lessin et al., 1988, (66)]. NGFR and HOX2 are distal to K9. It has been
demonstrated that the KRT10, KRT13, and KRT15 genes are located in the same large pulsed field gel electrophoresis
fragment [Romano et al., 1991, (67)]. A correlation of assignments of the 3 genes makes 17q21-q22 the likely location
of the cluster. Transgenic mice expressing a mutant keratin 10 gene have the phenotype of epidermolytic
hyperkeratosis, thus suggesting that a genetic basis for the human disorder resides in mutations in genes encoding
suprabasal keratins KRT1 or KRT10 [Fuchs et al., 1992, (68)]. The authors also showed that stimulation of basal cell
proliferation can result from a defect in suprabasal cells and that distortion of nuclear shape or alterations in cytokinesis
can occur when an intermediate filament network is perturbed. In a family with keratosis palmaris et plantaris without
blistering either spontaneously or in response to mild mechanical or thermal stress and with no involvement of the skin
and parts of the body other than the palms and soles, a tight linkage to an insertion-deletion polymorphism in the
C-terminal coding region of the KRT10 gene (maximum lod score = 8.36 at theta = 0.00) was found [Rogaev et al., 1993,
(69)]. It is noteworthy that it was a rare, high molecular weight allele of the KRT10 polymorphism that segregated with
the disorder. The allele was observed once in 96 independent chromosomes from unaffected Caucasians. The KRT10
polymorphism arose from the insertion/deletion of imperfect (CCG)n repeats within the coding region and gave rise to
a variable glycine loop motif in the C-terminal tail of the keratin 10 protein. It is possible that there was a pathogenic
role for the expansion of the imperfect trinucleotide repeat.

KRT12, K12

[0116] Keratins are a group of water-insoluble proteins that form 10 nm intermediate filaments in epithelial cells.
Approximately 30 different keratin molecules have been identified. They can be divided into acidic and basic-neutral
subfamilies according to their relative charges, immunoreactivity, and sequence homologies to types I and II wool
keratins, respectively. In vivo, a basic keratin usually is co-expressed and 'paired' with a particular acidic keratin to
form a heterodimer. The expression of various keratin pairs is tissue specific, differentiation dependent, and developmentally
regulated. The presence of specific keratin pairs is essential for the maintenance of the integrity of epithelium.
For example, mutations in the human K14/K5 pair and the K10/K1 pair underlie the skin diseases epidermolysis bullosa
simplex and epidermolytic hyperkeratosis, respectively. Expression of the K3 and K12 keratin pair has been found in
the cornea of a wide number of species, including human, mouse, and chicken, and is regarded as a marker for corneal-type
epithelial differentiation. The murine Krt12 (Krt1.12) gene has been cloned, and its expression has been shown to be
corneal epithelial cell specific, differentiation dependent, and developmentally regulated [Liu et al., 1993, (70)]. The
corneal-specific nature of keratin 12 gene expression signifies that keratin 12 plays a unique role in maintaining normal
corneal epithelial function. Nevertheless, the exact function of keratin 12 remains unknown and no hereditary human corneal epithelial
disorder has been linked directly to a mutation in the keratin 12 gene. As part of a study of the expression profile of
human corneal epithelial cells, a cDNA with an open reading frame highly homologous to the cornea-specific mouse
keratin 12 gene has been isolated [Nishida et al., 1996, (71)]. To elucidate the function of keratin 12, knockout mice
lacking the Krt1.12 gene have been created by gene targeting techniques. The heterozygous mice appeared normal.
Homozygous mice developed normally and suffered mild corneal epithelial erosion. The corneal epithelia were fragile
and could be removed by gentle rubbing of the eyes or brushing. The corneal epithelium of the homozygotes did not
express keratin 12 as judged by immunohistochemistry, Western immunoblot analysis with epitope-specific anti-keratin
12 antibodies, Northern hybridization, and in situ hybridization with an antisense keratin 12 riboprobe. The KRT12 gene
has been mapped to 17q by study of radiation hybrids and localized to the type I keratin cluster in the interval between
D17S800 and D17S930 (17q12-q21) [Nishida et al., 1997, (72)]. The authors presented the exon-intron boundary
structure of the KRT12 gene and mapped the gene to 17q12 by fluorescence in situ hybridization. The gene contains
7 introns, defining 8 exons that cover the coding sequence. Together the exons and introns span approximately 6 kb
of genomic DNA.

[0117] Meesmann corneal dystrophy is an autosomal dominant disorder causing fragility of the anterior corneal epithelium,
where the cornea-specific keratins K3 and K12 are expressed. Dominant-negative mutations in these keratins
might be the cause of Meesmann corneal dystrophy. Indeed, linkage of the disorder to the K12 locus in Meesmann's
original German kindred [Meesmann and Wilke, 1939, (73)] with Z(max) = 7.53 at theta = 0.0 has been found. In 2
pedigrees from Northern Ireland, the disorder was found to co-segregate with K12 in one pedigree and K3 in the
other. Heterozygous missense mutations in K3 or in K12 (R135T, V143L) in each family have been identified. All these
mutations occurred in highly conserved keratin helix boundary motifs, where dominant mutations in other keratins have
been found to compromise cytoskeletal function severely, leading to keratinocyte fragility.

[0118] The regions of the human KRT12 gene have been sequenced to enable mutation detection for all exons using
genomic DNA as a template [Corden et al., 2000, (74)]. The authors found that the human genomic sequence spans
5,919 bp and consists of 8 exons. A microsatellite dinucleotide repeat was identified within intron 3, which was highly
polymorphic and which they developed for use in genotype analysis. In addition, 2 mutations in the helix initiation motif
of K12 were found in families with Meesmann corneal dystrophy. In an American kindred, a missense M129T mutation
was found in the KRT12 gene. They stated that a total of 8 mutations in the KRT12 gene had been reported.

Genetic interactions within ARCHEONs

[0119] Genes involved in genomic alterations (amplifications, insertions, translocations, deletions, etc.) exhibit
changes in their expression pattern. Of particular interest are gene amplifications, which account for gene copy numbers
>2 per cell or deletions accounting for gene copy numbers <2 per cell. Gene copy number and gene expression of the
respective genes do not necessarily correlate. Transcriptional overexpression needs an intact transcriptional context,
as determined by regulatory regions at the chromosomal locus (promoter, enhancer and silencer), and sufficient
amounts of transcriptional regulators being present in effective combinations. This is especially true for genomic regions
whose expression is tightly regulated in specific tissues or during specific developmental stages. ARCHEONs
are specified by gene clusters of more than two genes being directly neighboured or in chromosomal order, interspersed
by a maximum of 10, preferably 7, more preferably 5 or at least 1 gene. The interspersed genes are also co-amplified
but do not directly interact with the ARCHEON. Such an ARCHEON may spread over a chromosomal region of a
maximum of 20, more preferably 10 or at least 6 Megabases. The nature of an ARCHEON is characterized by the
simultaneous amplification and/or deletion and the correlating expression (i.e. upregulation or downregulation, respectively)
of the encompassed genes in a specific tissue, cell type, cellular or developmental state or time point. Such
ARCHEONs are commonly conserved during evolution, as they play critical roles during cellular development. In the case
of these ARCHEONs, whole gene clusters are overexpressed upon amplification as they harbor self-regulatory feedback
loops, which stabilize gene expression and/or biological effector function even in abnormal biological settings, or are
regulated by very similar transcription factor combinations, reflecting their simultaneous function in specific tissues at
certain developmental stages. Therefore, the gene copy number correlates with the expression level especially for
genes in gene clusters functioning as ARCHEONs. In the case of abnormal gene expression in neoplastic lesions it is of
great importance to know whether the self-regulatory feedback loops have been conserved as they determine the
biological activity of the ARCHEON gene members.
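
The numerical criteria in the preceding paragraph (copy numbers above or below 2 per cell, more than two genes per cluster, at most 10 interspersed genes, and a span of at most 20 Megabases) can be made concrete with a short sketch. This is only an illustration under stated assumptions: the `Gene` record, the gene labels, the positions and the copy numbers are hypothetical, and the check covers the layout criteria only, not the correlated expression that also characterizes an ARCHEON.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    name: str           # hypothetical gene label
    order: int          # hypothetical index of the gene along the chromosome
    position_mb: float  # hypothetical position in Megabases
    copy_number: float  # measured gene copies per cell


def copy_number_state(copy_number: float, normal: float = 2.0) -> str:
    """Classify a copy number as in the text: >2 amplified, <2 deleted."""
    if copy_number > normal:
        return "amplified"
    if copy_number < normal:
        return "deleted"
    return "normal"


def meets_archeon_layout(genes, max_interspersed=10, max_span_mb=20.0) -> bool:
    """Check cluster size, interspersed-gene count and chromosomal span."""
    if len(genes) <= 2:  # the text requires more than two genes
        return False
    ordered = sorted(genes, key=lambda g: g.order)
    # Number of non-member genes lying between consecutive cluster members.
    gaps = [b.order - a.order - 1 for a, b in zip(ordered, ordered[1:])]
    span = max(g.position_mb for g in genes) - min(g.position_mb for g in genes)
    return all(gap <= max_interspersed for gap in gaps) and span <= max_span_mb


# Made-up example cluster; none of these values are measurements.
cluster = [
    Gene("GENE_A", 0, 37.0, 6),
    Gene("GENE_B", 1, 37.5, 6),
    Gene("GENE_C", 5, 38.2, 5),
]

if __name__ == "__main__":
    print(meets_archeon_layout(cluster))
    print([copy_number_state(g.copy_number) for g in cluster])
```

The default thresholds correspond to the broadest limits stated above; the preferred, narrower limits (7 or 5 interspersed genes, 10 Megabases) can be passed as arguments.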

[0120] The intensive interaction between genes in ARCHEONs is described for the 17q12 ARCHEON (Fig. 1) by
way of illustration, not by limitation. In one embodiment the presence or absence of alterations of genes within distinct
genomic regions are correlated with each other, as exemplified for breast cancer cell lines (Fig. 3 and Fig. 4). This
relates to the discovery of the present invention that multiple interactions occur between said gene products of defined
chromosomal localizations, which, according to their respective alterations in abnormal tissue, have predictive, diagnostic,
prognostic and/or preventive and therapeutic value. These interactions are mediated directly or indirectly, due to the
fact that the respective genes are part of interconnected or independent signaling networks or regulate cellular behavior
(differentiation status, proliferative and/or apoptotic capacity, invasiveness, drug responsiveness, immune modulatory
activities) in a synergistic, antagonistic or independent fashion. The order of functionally important genes within the
ARCHEONs has been conserved during evolution (e.g. the ARCHEON on human chromosome 17q12 is present on
mouse chromosome 11). Moreover, it has been found that the 17q12 ARCHEON is also present on human chromosome
3p21 and 12q13, both of which are also involved in amplification events and in tumor development. Most probably
these homologous ARCHEONs were formed by duplications and rearrangements during vertebrate evolution. Homologous
ARCHEONs consist of homologous genes and/or isoforms of specific gene families (e.g. RARA or RARB or
RARG, THRA or THRB, TOP2A or TOP2B, RAB5A or RAB5B, BAF170 or BAF155, BAF60A or BAF60B, WNT5A or
WNT5B, IGFBP4 or IGFBP6). Moreover these regions are flanked by homologous chromosomal gene clusters (e.g.
CACN, SCYA, HOX, Keratins). These ARCHEONs have diverged during evolution to fulfill their respective functions
in distinct tissues (e.g. the 17q12 ARCHEON has one of its main functions in the central nervous system). Due to their
tissue specific function, extensive regulatory loops control the expression of the members of each ARCHEON. During
tumor development these regulations become critical for the characteristics of the abnormal tissues with respect to
differentiation, proliferation, drug responsiveness and invasiveness. It has been found that the co-amplification of genes
within ARCHEONs can lead to co-expression of the respective gene products. Some of said genes also exhibit additional
mutations or specific patterns of polymorphisms, which are substantial for the oncogenic capacities of these
ARCHEONs. It is one of the critical features of such amplicons which members of the ARCHEON have been conserved
during tumor formation (e.g. during amplification and deletion events), thereby defining these genes as diagnostic
marker genes. Moreover, the expression of certain genes within the ARCHEON can be influenced by other members
of the ARCHEON, thereby defining the regulatory and regulated genes as target genes for therapeutic intervention. It
was also observed that the expression of certain members of the ARCHEON is sensitive to drug treatment (e.g. TOPO2
alpha, RARA, THRA, HER-2), which defines these genes as "marker genes". Moreover several other genes are suitable
for therapeutic intervention by antibodies (CACNB1, EBI1), ligands (CACNB1) or drugs such as kinase inhibitors
(CrkRS, CDC6). The following examples of interactions between members of ARCHEONs are offered by way of illustration,
not by way of limitation.

[0121] EBI1/CCR7 is a lymphoid-specific member of the G protein-coupled receptor family. EBI1 recognizes
chemoattractants, such as interleukin-8, SCYAs, Rantes, C5a, and fMet-Leu-Phe. The capacity for cell division is largely
confined to the CCR7+ subsets in lymphocytes. Double-negative cells did not divide or expand after stimulation. CCR7-
cells, considered to be terminally differentiated, fail to divide, but do produce interferon-gamma and express high levels
of perforin. EBI1 is induced by viral activities such as those of the Epstein-Barr virus. Therefore, EBI1 is associated with
transformation events in lymphocytes. A functional role of EBI1 during tumor formation in non-lymphoid tissues has
been investigated in this invention. Interestingly, ERBA and ERBB, located in the same genomic region, are also
associated with lymphocyte transformation. Moreover, ligands of the receptor (i.e. SCYA5/Rantes) are in genomic
proximity on 17q. Abnormal expression of both of these factors in lymphoid and non-lymphoid tissues establishes an
autoregulatory feedback loop, inducing signaling events within the respective cells. Expression of lymphoid factors has
an effect on immune cells and modulates cellular behavior. This is of particular interest with regard to abnormal breast
tissue being infiltrated by lymphocytes. In line with this, another immunomodulatory and proliferation factor is located
nearby on 17q12. Granulocyte colony-stimulating factor (GCSF3) specifically stimulates the proliferation and
differentiation of the progenitor cells for granulocytes. A stimulatory activity from a glioblastoma multiforme cell line being
biologically and biochemically indistinguishable from GCSF produced by a bladder cell line has also been found.
Colony-stimulating factors not only affect immune cells, but also induce cellular responses of non-immune cells, indicating
possible involvement in tumor development upon abnormal expression. In addition, several other genes of the 17q12
ARCHEON are involved in proliferation, survival, differentiation of immune cells and/or lymphoblastic leukemia, such
as MLLT6, ZNF144 and ZNFN1A3, again demonstrating the related functions of the gene products in interconnected
key processes within specific cell types. Aberrant expression of more than one of these genes in non-immune cells
constitutes signalling activities that contribute to the oncogenic activities that derive solely from overexpression of the
Her-2/neu gene.

[0122] PPARBP has been found in complex with the tumor suppressor gene of the p53 family. Moreover, PPARBP
also binds to PPAR-alpha (PPARA), RAR-alpha (RARA), RXR, THRA and TR-beta-1. Due to its ability to bind to thyroid
hormone receptors it has been named TRIP2 and TRAP220. In these complexes PPARBP affects gene regulatory
activities. Interestingly, PPARBP is located in genomic proximity to its interaction partners THRA and RARA. We have
found PPARBP to be co-amplified with THRA and RARA in tumor tissue. THRA has been isolated from avian
erythroblastosis virus in conjunction with ERBB and therefore was named ERBA. ERBA potentiates ERBB by blocking
differentiation of erythroblasts at an immature stage. ERBA has been shown to influence ERBB expression. In this
setting deletions of C-terminal portions of the THRA gene product are of influence. Aberrant THRA expression has
also been found in nonfunctioning pituitary tumors, which has been hypothesized to reflect mutations in the receptor
coding and regulatory sequences. THRA function promotes tumor cell development by regulating gene expression of
regulatory genes and by influencing metabolic activities (e.g. of key enzymes of alternative metabolic pathways in
tumors such as malic enzyme and genes responsible for lipogenesis). The observed activities of nuclear receptors not
only reflect their transactivating potential, but are also due to posttranscriptional activities in the absence or presence
of ligands. Co-amplification of THRA/ERBA and ERBB has been shown, but its influence on tumor development has
been doubted as no overexpression could be demonstrated in breast tumors [van de Vijver et al., 1987, (75)]. THRA
and RARA are part of the nuclear receptor family, whose function can be mediated as monomers, homodimers or
heterodimers. RARA regulates differentiation of a broad spectrum of cells. Interactions of hormones with ERBB expression
have been investigated. Ligands of RARA can inhibit the expression of amplified ERBB genes in breast tumors
[Offterdinger et al., 1998, (76)]. As part of this invention, co-amplification and co-expression of THRA and RARA could
be shown. It was also found that multiple genes, which are regulated by members of the thyroid hormone receptor
and retinoic acid receptor families, are differentially expressed in tumor samples, corresponding to their genomic
alterations (amplification, mutation, deletion). These hormone receptor genes and respective target genes are useful to
discriminate patient samples with respect to clinical features.

[0123] By expression analysis of multiple normal tissues, tumor samples and tumor cell lines and subsequent
clustering of the 17q12 region, it was found that the expression profile of Her-2/neu positive tumor cells and tumor samples
exhibits similarities with the expression pattern of tissue from the central nervous system (Fig. 2). This is in line with
the observed malformations in the central nervous system of Her-2/neu and THRA knock-out mice. Moreover, it was
found that NEUROD2, a nuclear factor involved specifically in neurogenesis, is commonly expressed in the respective
samples. This led to the definition of the 17q12 locus as being an "ARCHEON", whose primary function in normal
organ development is assigned to the central nervous system. Surprisingly, the expression of NEUROD2 was affected
by therapeutic intervention. Strikingly, ZNF144, TEM7, PIP5K and PPP1R1B are also expressed in neuronal cells,
where they display diverse tissue specific functions.

[0124] In addition, Her-2/neu is often co-amplified with GRB7, a downstream member of the signaling cascade being
involved in invasive properties of tumors. Surprisingly, we have found another member of the Her-2/neu signaling
cascade to be overexpressed in primary breast tumors: TOB1 (= "Transducer of ERBB signaling"). Strong overexpression
of TOB1 correlated with weaker overexpression of Her-2/neu, already indicating its involvement in oncogenic
signaling activities. Amplification of Her-2/neu has been assigned to enhanced proliferative capacity, due to the
identified downstream components of the signaling cascade (e.g. Ras-Raf-MAPK). In this respect it was surprising that
some cdc genes, which are cell cycle dependent kinases, are part of the amplicons, which upon altered expression
have great impact on cell cycle progression.

[0125] According to the observations described above the following examples of genes at 3q21-26 are offered by 
way of illustration, not by way of limitation. 

-> WNT5A, CACNA1D, THRB, RARB, TOP2B, RAB5B, SMARCC1 (BAF155), RAF, WNT7A 

[0126] The following examples of genes at 12q13 are offered by way of illustration, not by way of limitation.

-> CACNB3, Keratins, NR4A1, RAB5/13, RARgamma, STAT6, WNT10B, (GCN5), (SAS: Sarcoma Amplified Sequence),
SMARCC2 (BAF170), SMARCD1 (BAF60A), (GAS41: Glioma Amplified Sequence), (CHOP), Her3,
KRTHB, HOX C, IGFBP6, WNT5B

[0127] There is cross-talk between the amplified ARCHEONs described above and some other highly amplified
genomic regions located approximately at 1p13, 1q32, 2p16, 2q21, 3p12, 5p13, 6p12, 7p12, 7q21, 8q23, 11q13, 13q12,
19q13, 20q13 and 21q11. The above mentioned chromosomal regions are described by way of illustration, not by way
of limitation, as the amplified regions often span larger and/or overlapping positions at these chromosomal positions.

[0128] Additional alterations of non-transcribed genes, pseudogenes or intergenic regions of said chromosomal
locations can be measured for prediction, diagnosis, prognosis, prevention and treatment of malignant neoplasia and
breast cancer in particular. Some of the genes or genomic regions have no direct influence on the members of the
ARCHEONs or the genes within distinct chromosomal regions but still retain marker gene function due to their
chromosomal positioning in the neighborhood of functionally critical genes (e.g. Telethonin neighboring the Her-2/neu gene).

[0129] The invention further relates to the use of: 

a) a polynucleotide comprising at least one of the sequences of SEQ ID NO: 1 to 26 or 53 to 75;

b) a polynucleotide which hybridizes under stringent conditions to a polynucleotide specified in (a) encoding a
polypeptide exhibiting the same biological function as specified for the respective sequence in Table 2 or 3;

c) a polynucleotide the sequence of which deviates from the polynucleotide specified in (a) and (b) due to the
degeneracy of the genetic code, encoding a polypeptide exhibiting the same biological function as specified for the
respective sequence in Table 2 or 3;

d) a polynucleotide which represents a specific fragment, derivative or allelic variation of a polynucleotide sequence
specified in (a) to (c);

e) an antisense molecule targeting specifically one of the polynucleotide sequences specified in (a) to (d);

f) a purified polypeptide encoded by a polynucleotide sequence specified in (a) to (d);

g) a purified polypeptide comprising at least one of the sequences of SEQ ID NO: 27 to 52 or 76 to 98;

h) an antibody capable of binding to one of the polynucleotides specified in (a) to (d) or a polypeptide specified in
(f) and (g);

i) a reagent identified by any of the methods of claims 14 to 16 that modulates the amount or activity of a
polynucleotide sequence specified in (a) to (d) or a polypeptide specified in (f) and (g)

in the preparation of a composition for the prevention, prediction, diagnosis or prognosis, or of a medicament for the
treatment, of malignant neoplasia and breast cancer in particular.

Polynucleotides 

[0130] A "BREAST CANCER GENE" polynucleotide can be single- or double-stranded and comprises a coding
sequence or the complement of a coding sequence for a "BREAST CANCER GENE" polypeptide. Degenerate nucleotide
sequences encoding human "BREAST CANCER GENE" polypeptides, as well as homologous nucleotide sequences
which are at least about 50, 55, 60, 65, 70, preferably about 75, 90, 96, or 98% identical to the nucleotide sequences
of SEQ ID NO: 1 to 26 or 53 to 75 also are "BREAST CANCER GENE" polynucleotides. Percent sequence identity
between the sequences of two polynucleotides is determined using computer programs such as ALIGN which employ
the FASTA algorithm, using an affine gap search with a gap open penalty of -12 and a gap extension penalty of -2.
Complementary DNA (cDNA) molecules, species homologues, and variants of "BREAST CANCER GENE"
polynucleotides which encode biologically active "BREAST CANCER GENE" polypeptides also are "BREAST CANCER GENE"
polynucleotides.
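
The percent identity and affine gap terms mentioned above can be illustrated with a minimal sketch for a pair of sequences that has already been aligned. This is not the ALIGN or FASTA implementation. The function names, the choice of alignment length as the denominator, and the convention that each gap run costs the open penalty once plus the extension penalty per gapped position are assumptions made for illustration only.

```python
GAP_OPEN = -12    # gap open penalty stated in the text
GAP_EXTEND = -2   # gap extension penalty stated in the text


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns in which both sequences carry the same base.

    Assumption: identity is reported relative to the full alignment length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-")
    return 100.0 * matches / len(aligned_a)


def affine_gap_penalty(aligned_a: str, aligned_b: str) -> int:
    """Sum of affine gap penalties over both aligned sequences.

    Assumed convention: every run of '-' costs GAP_OPEN once,
    plus GAP_EXTEND for each position in the run.
    """
    total = 0
    for seq in (aligned_a, aligned_b):
        in_gap = False
        for ch in seq:
            if ch == "-":
                if not in_gap:
                    total += GAP_OPEN  # a new gap run starts here
                total += GAP_EXTEND
                in_gap = True
            else:
                in_gap = False
    return total
```

With an affine scheme, one long gap is penalized less than several short gaps of the same total length, which is why the open penalty is much larger than the extension penalty.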

Preparation of Polynucleotides 

[0131] A naturally occurring "BREAST CANCER GENE" polynucleotide can be isolated free of other cellular
components such as membrane components, proteins, and lipids. Polynucleotides can be made by a cell and isolated
using standard nucleic acid purification techniques, or synthesized using an amplification technique, such as the
polymerase chain reaction (PCR), or by using an automatic synthesizer. Methods for isolating polynucleotides are
routine and are known in the art. Any such technique for obtaining a polynucleotide can be used to obtain isolated
"BREAST CANCER GENE" polynucleotides. For example, restriction enzymes and probes can be used to isolate
polynucleotide fragments which comprise "BREAST CANCER GENE" nucleotide sequences. Isolated
polynucleotides are in preparations which are free or at least 70, 80, or 90% free of other molecules.

[0132] "BREAST CANCER GENE" cDNA molecules can be made with standard molecular biology techniques, using
"BREAST CANCER GENE" mRNA as a template. Any RNA isolation technique which does not select against the
isolation of mRNA may be utilized for the purification of such RNA samples. See, for example, Sambrook et al., 1989,
(77); and Ausubel, F. M. et al., 1989, (78), both of which are incorporated herein by reference in their entirety.
Additionally, large numbers of tissue samples may readily be processed using techniques well known to those of skill in
the art, such as, for example, the single-step RNA isolation process of Chomczynski, P. (1989, U.S. Pat. No. 4,843,155),
which is incorporated herein by reference in its entirety.

[0133] "BREAST CANCER GENE" cDNA molecules can thereafter be replicated using molecular biology techniques
known in the art and disclosed in manuals such as Sambrook et al., 1989, (77). An amplification technique, such as
PCR, can be used to obtain additional copies of polynucleotides of the invention, using either human genomic DNA or
cDNA as a template.

[0134] Alternatively, synthetic chemistry techniques can be used to synthesize "BREAST CANCER GENE"
polynucleotides. The degeneracy of the genetic code allows alternate nucleotide sequences to be synthesized which will
encode a "BREAST CANCER GENE" polypeptide or a biologically active variant thereof.
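
As a minimal sketch of what the degeneracy of the genetic code means in practice, the following enumerates every nucleotide sequence that encodes a given short peptide under the standard genetic code. The function names and the example peptides are hypothetical, and the enumeration is only practical for short peptides because the number of alternatives grows multiplicatively with length.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Codon -> one-letter amino acid.
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Amino acid -> list of synonymous codons.
SYNONYMOUS = {}
for codon, aa in CODON_TABLE.items():
    SYNONYMOUS.setdefault(aa, []).append(codon)


def translate(dna: str) -> str:
    """Translate an upper-case DNA coding sequence, ignoring a trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def alternate_coding_sequences(peptide: str):
    """Yield every DNA sequence that encodes the given peptide (one-letter codes)."""
    for codons in product(*(SYNONYMOUS[aa] for aa in peptide)):
        yield "".join(codons)
```

For example, a dipeptide of methionine and leucine can be encoded by six different sequences, because methionine has one codon and leucine has six.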

Identification of differential expression 

[0135] Transcripts within the collected RNA samples which represent RNA produced by differentially expressed
genes may be identified by utilizing a variety of methods which are well known to those of skill in the art. For example,
differential screening [Tedder, T. F. et al., 1988, (79)], subtractive hybridization [Hedrick, S. M. et al., 1984, (80); Lee,
S. W. et al., 1984, (81)], and, preferably, differential display (Liang, P., and Pardee, A. B., 1993, U.S. Pat. No. 5,262,311,
which is incorporated herein by reference in its entirety), may be utilized to identify polynucleotide sequences derived
from genes that are differentially expressed.

[0136] Differential screening involves the duplicate screening of a cDNA library in which one copy of the library is
screened with a total cell cDNA probe corresponding to the mRNA population of one cell type while a duplicate copy
of the cDNA library is screened with a total cDNA probe corresponding to the mRNA population of a second cell type.
For example, one cDNA probe may correspond to a total cell cDNA probe of a cell type derived from a control subject,
while the second cDNA probe may correspond to a total cell cDNA probe of the same cell type derived from an
experimental subject. Those clones which hybridize to one probe but not to the other potentially represent clones derived
from genes differentially expressed in the cell type of interest in control versus experimental subjects.
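
The selection logic of differential screening can be sketched as a comparison of two sets of hybridizing clones. This is only an illustration of the bookkeeping, assuming each clone of the duplicated library carries an identifier and has been scored as hybridizing or not. The function name and the clone identifiers in the usage example are hypothetical.

```python
def differential_clones(hits_control: set, hits_experimental: set) -> dict:
    """Return clones that hybridize to exactly one of the two probes.

    hits_control: clone IDs hybridizing to the control-derived cDNA probe.
    hits_experimental: clone IDs hybridizing to the experimental cDNA probe.
    """
    return {
        # candidates for genes expressed only (or more strongly) in control cells
        "control_only": hits_control - hits_experimental,
        # candidates for genes expressed only (or more strongly) in experimental cells
        "experimental_only": hits_experimental - hits_control,
    }


# Usage with hypothetical clone identifiers:
result = differential_clones({"c1", "c2", "c3"}, {"c2", "c3", "c4"})
```

Clones that hybridize to both probes drop out, which leaves only the potentially differentially expressed candidates for corroboration.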

[0137] Subtractive hybridization techniques generally involve the isolation of mRNA taken from two different sources,
e.g., control and experimental tissue, the hybridization of the mRNA or single-stranded cDNA reverse-transcribed from
the isolated mRNA, and the removal of all hybridized, and therefore double-stranded, sequences. The remaining
non-hybridized, single-stranded cDNAs potentially represent clones derived from genes that are differentially expressed
in the two mRNA sources. Such single-stranded cDNAs are then used as the starting material for the construction of
a library comprising clones derived from differentially expressed genes.

[0138] The differential display technique describes a procedure, utilizing the well known polymerase chain reaction
(PCR; the experimental embodiment set forth in Mullis, K. B., 1987, U.S. Pat. No. 4,683,202), which allows for the
identification of sequences derived from genes which are differentially expressed. First, isolated RNA is
reverse-transcribed into single-stranded cDNA, utilizing standard techniques which are well known to those of skill in the art. Primers
for the reverse transcriptase reaction may include, but are not limited to, oligo dT-containing primers, preferably of the
reverse primer type of oligonucleotide described below. Next, this technique uses pairs of PCR primers, as described
below, which allow for the amplification of clones representing a random subset of the RNA transcripts present within
any given cell. Utilizing different pairs of primers allows each of the mRNA transcripts present in a cell to be amplified.
Among such amplified transcripts may be identified those which have been produced from differentially expressed
genes.

[0139] The reverse oligonucleotide primer of the primer pairs may contain an oligo dT stretch of nucleotides,
preferably eleven nucleotides long, at its 5' end, which hybridizes to the poly(A) tail of mRNA or to the complement of a
cDNA reverse transcribed from an mRNA poly(A) tail. Second, in order to increase the specificity of the reverse primer,
the primer may contain one or more, preferably two, additional nucleotides at its 3' end. Because, statistically, only a
subset of the mRNA derived sequences present in the sample of interest will hybridize to such primers, the additional
nucleotides allow the primers to amplify only a subset of the mRNA derived sequences present in the sample of interest.
This is preferred in that it allows more accurate and complete visualization and characterization of each of the bands
representing amplified sequences.
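
The reverse primer design described above can be sketched by enumerating oligo dT primers that carry additional anchoring nucleotides at the 3' end. The function name is hypothetical. Excluding a T at the first anchor position is an assumption made here, on the reasoning that it would only lengthen the oligo dT stretch rather than anchor the primer.

```python
from itertools import product


def anchored_oligo_dt_primers(t_length: int = 11, anchor_length: int = 2) -> list:
    """List reverse primers (5'->3') made of an oligo dT stretch plus 3' anchor bases.

    t_length: number of T residues at the 5' end (eleven is the preferred value in the text).
    anchor_length: number of additional 3' nucleotides (two is the preferred value in the text).
    """
    if t_length < 1 or anchor_length < 1:
        raise ValueError("t_length and anchor_length must be at least 1")
    primers = []
    for anchor in product("ACGT", repeat=anchor_length):
        if anchor[0] == "T":
            continue  # assumption: a leading T would not anchor the primer
        primers.append("T" * t_length + "".join(anchor))
    return primers
```

Under these assumptions the default settings give twelve primers, each of which would be expected to prime from a different subset of the polyadenylated transcripts.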

[0140] The forward primer may contain a nucleotide sequence expected, statistically, to have the ability to hybridize
to cDNA sequences derived from the tissues of interest. The nucleotide sequence may be an arbitrary one, and the
length of the forward oligonucleotide primer may range from about 9 to about 13 nucleotides, with about 10 nucleotides
being preferred. Arbitrary primer sequences cause the lengths of the amplified partial cDNAs produced to be variable,
thus allowing different clones to be separated by using standard denaturing sequencing gel electrophoresis. PCR
reaction conditions should be chosen which optimize amplified product yield and specificity, and, additionally, produce
amplified products of lengths which may be resolved utilizing standard gel electrophoresis techniques. Such reaction
conditions are well known to those of skill in the art, and important reaction parameters include, for example, length
and nucleotide sequence of oligonucleotide primers as discussed above, and annealing and elongation step
temperatures and reaction times. The pattern of clones resulting from the reverse transcription and amplification of the mRNA
of two different cell types is displayed via sequencing gel electrophoresis and compared. Differences in the two banding
patterns indicate potentially differentially expressed genes.

[0141] When screening for full-length cDNAs, it is preferable to use libraries that have been size-selected to include
larger cDNAs. Randomly-primed libraries are preferable, in that they will contain more sequences which contain the
5' regions of genes. Use of a randomly primed library may be especially preferable for situations in which an oligo d(T)
library does not yield a full-length cDNA. Genomic libraries can be useful for extension of sequence into 5'
non-transcribed regulatory regions.

[0142] Commercially available capillary electrophoresis systems can be used to analyze the size or confirm the
nucleotide sequence of PCR or sequencing products. For example, capillary sequencing can employ flowable polymers
for electrophoretic separation, four different fluorescent dyes (one for each nucleotide) which are laser activated, and
detection of the emitted wavelengths by a charge coupled device camera. Output/light intensity can be converted to
electrical signal using appropriate software (e.g. GENOTYPER and Sequence NAVIGATOR, Perkin Elmer; ABI), and
the entire process from loading of samples to computer analysis and electronic data display can be computer controlled.
Capillary electrophoresis is especially preferable for the sequencing of small pieces of DNA which might be present in
limited amounts in a particular sample.

[0143] Once potentially differentially expressed gene sequences have been identified via bulk techniques such as,
for example, those described above, the differential expression of such putatively differentially expressed genes should
be corroborated. Corroboration may be accomplished via, for example, such well known techniques as Northern
analysis and/or RT-PCR. Upon corroboration, the differentially expressed genes may be further characterized, and may
be identified as target and/or marker genes, as discussed below.

[0144] Also, amplified sequences of differentially expressed genes obtained through, for example, differential display
may be used to isolate full length clones of the corresponding gene. The full length coding portion of the gene may
readily be isolated, without undue experimentation, by molecular biological techniques well known in the art. For
example, the isolated differentially expressed amplified fragment may be labeled and used to screen a cDNA library.
Alternatively, the labeled fragment may be used to screen a genomic library.

[0145] An analysis of the tissue distribution of the mRNA produced by the identified genes may be conducted, utilizing
standard techniques well known to those of skill in the art. Such techniques may include, for example, Northern analyses
and RT-PCR. Such analyses provide information as to whether the identified genes are expressed in tissues expected
to contribute to breast cancer. Such analyses may also provide quantitative information regarding steady state mRNA
regulation, yielding data concerning which of the identified genes exhibits a high level of regulation in, preferably,
tissues which may be expected to contribute to breast cancer.

[0146] Such analyses may also be performed on an isolated cell population of a particular cell type derived from a
given tissue. Additionally, standard in situ hybridization techniques may be utilized to provide information regarding
which cells within a given tissue express the identified gene. Such analyses may provide information regarding the
biological function of an identified gene relative to breast cancer in instances wherein only a subset of the cells within
the tissue is thought to be relevant to breast cancer.

Identification of co-amplified genes 

[0147] Genes involved in genomic alterations (amplifications, insertions, translocations, deletions, etc.) are identified
by PCR-based karyotyping in combination with database analysis. Of particular interest are gene amplifications, which
account for gene copy numbers >2 per cell. Gene copy number and gene expression of the respective genes often
correlate. Therefore clusters of genes being simultaneously overexpressed due to gene amplifications can be
identified by expression analysis via DNA-chip technologies or quantitative RT-PCR. For example, the altered expression
of genes due to increased or decreased gene copy numbers can be determined by GeneArray™ technologies from
Affymetrix or qRT-PCR with the TaqMan or iCycler Systems. Moreover, the combination of RNA with DNA analysis enables
highly parallel and automated characterization of multiple genomic regions of variable length with high resolution in
tissue or single cell samples. Furthermore, these assays enable the correlation of gene transcription relative to gene
copy number of target genes. As there is not necessarily a linear correlation of expression level and gene copy number
and as there are synergistic or antagonistic effects in certain gene clusters, the identification on the RNA level is easier
and probably more relevant for the biological outcome of the alterations especially in tumor tissue.
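
The relation between gene copy number and expression level discussed above could be examined along the following lines. This is a minimal sketch, assuming copy numbers and expression values are already available per sample. The function names are hypothetical, the copy-number threshold of 2 follows the text, and the Pearson coefficient is one possible choice that only captures linear association, which the text notes is not always present.

```python
from math import sqrt


def pearson(xs, ys) -> float:
    """Pearson correlation coefficient between two equally long numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def amplified_genes(copy_numbers: dict, threshold: float = 2.0) -> list:
    """Return, in sorted order, the genes whose copy number per cell exceeds the threshold."""
    return sorted(gene for gene, cn in copy_numbers.items() if cn > threshold)
```

A coefficient near 1 for a gene across samples would be consistent with expression following copy number, while a low value would point to the synergistic or antagonistic effects mentioned in the text.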

Detection of co-amplified genes in malignant neoplasia 

[0148] Chromosomal changes are commonly detected by FISH (= Fluorescence In Situ Hybridization) and CGH
(= Comparative Genomic Hybridization). For quantification of genomic regions, genes or intergenic regions can be used.
Such quantification measures the relative abundance of multiple genes with respect to each other (e.g. target gene
vs. centromeric region or housekeeping genes). Changes in relative abundance can be detected in paraffin-embedded
material even after extraction of RNA or genomic DNA. Measurement of genomic DNA has advantages compared to
RNA analysis due to the stability of DNA, which makes retrospective studies possible and
offers multiple internal controls (genes not being altered, amplified or deleted) for standardization and exact
calculations. Moreover, PCR analysis of genomic DNA offers the advantage of investigating intergenic, highly variable regions
or combinations of SNPs (= Single Nucleotide Polymorphisms), RFLPs, VNTRs and STRs (in general, polymorphic
markers). Determination of SNPs or polymorphic markers within defined genomic regions (e.g. SNP analysis by
"Pyrosequencing™") is relevant for characterizing the phenotype of the genomic alterations. For example, it is of advantage to determine
combinations of polymorphisms or haplotypes in order to characterize the biological potential of genes being part of
amplified alleles. Of particular interest are polymorphic markers in breakpoint regions, coding regions or regulatory
regions of genes or intergenic regions. By determining predictive haplotypes with defined biological or clinical outcome
it is possible to establish diagnostic and prognostic assays with non-tumor samples from patients. Depending on
whether preferably one allele or both alleles to the same extent are amplified (= linear or non-linear amplifications), haplotypes
can be determined. Overrepresentation of specific polymorphic marker combinations in cells or tissues with gene
amplifications facilitates haplotype determination, as e.g. combinations of heterozygous polymorphic markers in
nucleic acids isolated from normal tissues, body fluids or biological samples of one patient become almost homozygous
in neoplastic tissue of the very same patient. This "gain of homozygosity" corresponds to the measurement of an altered
genomic region due to amplification events and is suitable for identification of "gain of function" alterations in tumors,
which result in e.g. oncogenic or growth promoting activities. In contrast, the detection of "losses of heterozygosity" is
used for identification of anti-oncogenes, gate keeper genes or checkpoint genes, that suppress oncogenic activities
and negatively regulate cellular growth processes. This intrinsic difference clearly contrasts the impact of the respective
genomic regions on tumor development and emphasizes the significance of the "gain of homozygosity" measurements
disclosed in this invention. In addition to the analyses of SNPs, a comparative approach of blood leukocyte DNA and
tumor DNA based on VNTR detection can reveal the existence of a previously described ARCHEON. SNP and VNTR
sequences and primer sets most suitable for detection of the ARCHEON at 17q11-21 are disclosed in Table 4 and Table
6. Detection, quantification and sizing of such polymorphic markers can be achieved by methods known to those with
skill in the art. In one embodiment of this invention we disclose the comparative measurement of amount and size of
any of the disclosed VNTRs (Table 6) by PCR amplification and capillary electrophoresis (CE). PCR can be carried out by
standard protocols, preferably in a linear amplification range (low cycle number), and detection by CE should be carried
out according to the supplier's protocols (e.g. Agilent). More preferably, the detection of the VNTRs disclosed in Table 6 can be carried
out in a multiplex fashion, utilizing a variety of labeled primers (e.g. fluorescent, radioactive, bioactive) and a suitable
CE detection system (e.g. ABI 310). However, the detection can also be performed on slab gels consisting of highly
concentrated agarose or polyacrylamide with a monochromatic DNA stain. Enhancement of resolution can be achieved
by appropriate primer design and length variation to give best results in multiplex PCR.
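
The comparative measurement of a heterozygous VNTR in normal and tumor DNA can be sketched as a ratio of allele signals. This is only an illustration, assuming peak areas for the two alleles have been obtained by capillary electrophoresis within the linear amplification range. The function names and the two-fold threshold are hypothetical and are not values taken from this disclosure.

```python
def allelic_imbalance(normal_peaks, tumor_peaks) -> float:
    """Ratio of the tumor allele ratio to the normal allele ratio.

    normal_peaks, tumor_peaks: (allele_1, allele_2) peak areas for the same
    heterozygous marker in normal (e.g. blood leukocyte) and tumor DNA.
    A value of 1.0 means the allele balance is unchanged in the tumor.
    """
    n1, n2 = normal_peaks
    t1, t2 = tumor_peaks
    if min(n1, n2, t1, t2) <= 0:
        raise ValueError("all peak areas must be positive")
    return (t1 / t2) / (n1 / n2)


def shows_allelic_shift(normal_peaks, tumor_peaks, fold_threshold: float = 2.0) -> bool:
    """True if one allele is over-represented in the tumor by at least fold_threshold.

    The threshold is a hypothetical cut-off chosen for illustration.
    """
    ratio = allelic_imbalance(normal_peaks, tumor_peaks)
    return ratio >= fold_threshold or ratio <= 1.0 / fold_threshold
```

A shifted ratio alone does not distinguish amplification of one allele from loss of the other. Comparing the marker signal with an unaltered reference locus, as described for relative abundance measurements, would be needed to tell a "gain of homozygosity" from a loss of heterozygosity.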

[0149] It is also of interest to determine covalent modifications of DNA (e.g. methylation) or the associated chromatin
(e.g. acetylation or methylation of associated proteins) within the altered genomic regions, that have impact on
transcriptional activity of the genes. In general, by measuring multiple, short sequences (60-300 bp) these techniques
enable high-resolution analysis of target regions, which cannot be obtained by conventional methods such as FISH
analysis (2-100 kb). Moreover, the PCR-based DNA analysis techniques offer advantages with regard to sensitivity,
specificity, multiplexing, time consumption and low amount of patient material required. These techniques can be
optimized by combination with microdissection or macrodissection to obtain purer starting material for analysis.

Extending Polynucleotides 

[0150] In one embodiment of such a procedure for the identification and cloning of full length gene sequences, RNA
may be isolated, following standard procedures, from an appropriate tissue or cellular source. A reverse transcription
reaction may then be performed on the RNA using an oligonucleotide primer complementary to the mRNA that
corresponds to the amplified fragment, for the priming of first strand synthesis. Because the primer is anti-parallel to the
mRNA, extension will proceed toward the 5' end of the mRNA. The resulting RNA/DNA hybrid may then be "tailed" with
guanines using a standard terminal transferase reaction, the hybrid may be digested with RNase H, and second strand
synthesis may then be primed with a poly-C primer. Using the two primers, the 5' portion of the gene is amplified using
PCR. Sequences obtained may then be isolated and recombined with previously isolated sequences to generate a
full-length cDNA of the differentially expressed genes of the invention. For a review of cloning strategies and
recombinant DNA techniques, see e.g., Sambrook et al., (77); and Ausubel et al., (78).

[0151] Various PCR-based methods can be used to extend the polynucleotide sequences disclosed herein to detect
upstream sequences such as promoters and regulatory elements. For example, restriction site PCR uses universal
primers to retrieve unknown sequence adjacent to a known locus [Sarkar, 1993, (82)]. Genomic DNA is first amplified
in the presence of a primer to a linker sequence and a primer specific to the known region. The amplified sequences
are then subjected to a second round of PCR with the same linker primer and another specific primer internal to the
first one. Products of each round of PCR are transcribed with an appropriate RNA polymerase and sequenced using
reverse transcriptase.

[0152] Inverse PCR also can be used to amplify or extend sequences using divergent primers based on a known
region [Triglia et al., 1988, (83)]. Primers can be designed using commercially available software, such as OLIGO 4.06
Primer Analysis software (National Biosciences Inc., Plymouth, Minn.), to be e.g. 22-30 nucleotides in length, to have
a GC content of 50% or more, and to anneal to the target sequence at temperatures of about 68-72°C. The method uses
several restriction enzymes to generate a suitable fragment in the known region of a gene. The fragment is then
circularized by intramolecular ligation and used as a PCR template.
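The primer criteria named above can be screened with a few lines of code. The following is a minimal sketch only, assuming a length window of 22 to 30 nucleotides and a minimum GC content of 50%; the function names are hypothetical, and the annealing temperature criterion is deliberately not modelled here.

```python
def gc_content(seq: str) -> float:
    """Return the percentage of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def meets_primer_criteria(primer: str,
                          min_len: int = 22,
                          max_len: int = 30,
                          min_gc: float = 50.0) -> bool:
    """Check primer length and GC content (assumed thresholds, see lead-in)."""
    return min_len <= len(primer) <= max_len and gc_content(primer) >= min_gc


if __name__ == "__main__":
    candidate = "GCGCATGCCGATCGGCTAGCCGTA"  # hypothetical 24-mer
    print(len(candidate), round(gc_content(candidate), 1),
          meets_primer_criteria(candidate))
```

A candidate that passes this screen would still need to be checked for annealing temperature and self-complementarity with dedicated primer design software.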
[0153] Another method which can be used is capture PCR, which involves PCR amplification of DNA fragments
adjacent to a known sequence in human and yeast artificial chromosome DNA [Lagerstrom et al., 1991, (84)]. In this
method, multiple restriction enzyme digestions and ligations also can be used to place an engineered double-stranded
sequence into an unknown fragment of the DNA molecule before performing PCR.

[0154] Additionally, PCR, nested primers, and PROMOTERFINDER libraries (CLONTECH, Palo Alto, Calif.) can be
used to walk genomic DNA. This process avoids the need to screen libraries and is
useful in finding intron/exon junctions.

[0155] The sequences of the identified genes may be used, utilizing standard techniques, to place the genes onto
genetic maps, e.g., mouse [Copeland & Jenkins, 1991, (85)] and human genetic maps [Cohen et al., 1993, (86)]. Such
mapping information may yield information regarding the genes' importance to human disease by, for example,
identifying genes which map near genetic regions to which known genetic breast cancer tendencies map.

Identification of polynucleotide variants and homologues or splice variants 

[0156] Variants and homologues of the "BREAST CANCER GENE" polynucleotides described above also are
"BREAST CANCER GENE" polynucleotides. Typically, homologous "BREAST CANCER GENE" polynucleotide
sequences can be identified by hybridization of candidate polynucleotides to known "BREAST CANCER GENE"
polynucleotides under stringent conditions, as is known in the art. For example, using the following wash conditions: 2 X SSC
(0.3 M NaCl, 0.03 M sodium citrate, pH 7.0), 0.1% SDS, room temperature twice, 30 minutes each; then 2 X SSC,
0.1% SDS, 50°C once, 30 minutes; then 2 X SSC, room temperature twice, 10 minutes each, homologous sequences
can be identified which contain at most about 25-30% basepair mismatches. More preferably, homologous
polynucleotide strands contain 15-25% basepair mismatches, even more preferably 5-15% basepair mismatches.
[0157] Species homologues of the "BREAST CANCER GENE" polynucleotides disclosed herein also can be
identified by making suitable probes or primers and screening cDNA expression libraries from other species, such as mice,
monkeys, or yeast. Human variants of "BREAST CANCER GENE" polynucleotides can be identified, for example, by
screening human cDNA expression libraries. It is well known that the Tm of a double-stranded DNA decreases by
1-1.5°C with every 1% decrease in homology [Bonner et al., 1973, (87)]. Variants of human "BREAST CANCER GENE"
polynucleotides or "BREAST CANCER GENE" polynucleotides of other species can therefore be identified by
hybridizing a putative homologous "BREAST CANCER GENE" polynucleotide with a polynucleotide having a nucleotide
sequence of one of the sequences of the SEQ ID NO: 1 to 26 or 53 to 75 or the complement thereof to form a test
hybrid. The melting temperature of the test hybrid is compared with the melting temperature of a hybrid comprising
polynucleotides having perfectly complementary nucleotide sequences, and the number or percent of basepair
mismatches within the test hybrid is calculated.
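The comparison of melting temperatures described above can be illustrated with a short calculation. This is a hedged sketch only: it assumes the Tm drops linearly by 1 to 1.5°C per 1% mismatch, as stated in the text, and the function name is hypothetical.

```python
def mismatch_range(tm_perfect_c: float, tm_test_c: float) -> tuple:
    """Estimate the percent basepair mismatch of a test hybrid.

    Assumes a linear Tm depression of 1.0 to 1.5 degrees C per 1% mismatch.
    Returns (lower_estimate, upper_estimate) in percent.
    """
    delta = max(0.0, tm_perfect_c - tm_test_c)
    # 1.5 degrees per percent gives the lower bound, 1.0 the upper bound.
    return (delta / 1.5, delta / 1.0)


if __name__ == "__main__":
    low, high = mismatch_range(tm_perfect_c=85.0, tm_test_c=70.0)
    print(f"estimated mismatch: {low:.1f}% to {high:.1f}%")
```

For a test hybrid melting 15°C below the perfectly matched hybrid, the sketch reports a mismatch estimate of roughly 10% to 15%.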

[0158] Nucleotide sequences which hybridize to "BREAST CANCER GENE" polynucleotides or their complements
following stringent hybridization and/or wash conditions also are "BREAST CANCER GENE" polynucleotides. Stringent
wash conditions are well known and understood in the art and are disclosed, for example, in Sambrook et al., (77).
Typically, for stringent hybridization conditions a combination of temperature and salt concentration should be chosen
that is approximately 12-20°C below the calculated Tm of the hybrid under study. The Tm of a hybrid between a "BREAST
CANCER GENE" polynucleotide having a nucleotide sequence of one of the sequences of the SEQ ID NO: 1 to 26 or
53 to 75 or the complement thereof and a polynucleotide sequence which is at least about 50, preferably about 75,
90, 96, or 98% identical to one of those nucleotide sequences can be calculated, for example, using the equation below
[Bolton and McCarthy, 1962, (88)]:

Tm = 81.5°C + 16.6(log10[Na+]) + 0.41(%G + C) - 0.63(%formamide) - 600/l,

where l = the length of the hybrid in basepairs.
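The Tm calculation and the 12-20°C window below it can be worked through in code. The sketch below is illustrative only and assumes the conventional form of the equation, in which the salt term is added (+16.6 log10[Na+]); the function names and the example inputs are hypothetical.

```python
import math


def hybrid_tm(na_molar: float, gc_percent: float,
              formamide_percent: float, length_bp: int) -> float:
    """Tm in degrees C, assuming the conventional form of the equation:
    81.5 + 16.6*log10[Na+] + 0.41*(%G+C) - 0.63*(%formamide) - 600/l
    """
    if na_molar <= 0 or length_bp <= 0:
        raise ValueError("salt concentration and length must be positive")
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * gc_percent
            - 0.63 * formamide_percent
            - 600.0 / length_bp)


def stringent_window(tm_c: float) -> tuple:
    """Temperature range lying 20 to 12 degrees C below the calculated Tm."""
    return (tm_c - 20.0, tm_c - 12.0)


if __name__ == "__main__":
    tm = hybrid_tm(na_molar=1.0, gc_percent=50.0,
                   formamide_percent=0.0, length_bp=600)
    print(round(tm, 1), stringent_window(tm))
```

With 1 M Na+, 50% G+C, no formamide and a 600 basepair hybrid, the salt term vanishes and the sketch gives a Tm of about 101°C, so the stringent window would lie at roughly 81-89°C.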
[0159] Stringent wash conditions include, for example, 4 X SSC at 65°C, or 50% formamide, 4 X SSC at 28°C, or
0.5 X SSC, 0.1% SDS at 65°C. Highly stringent wash conditions include, for example, 0.2 X SSC at 65°C.
[0160] The biological function of the identified genes may be more directly assessed by utilizing relevant in vivo and
in vitro systems. In vivo systems may include, but are not limited to, animal systems which naturally exhibit breast
cancer predisposition, or ones which have been engineered to exhibit such symptoms, including but not limited to the
apoE-deficient malignant neoplasia mouse model [Plump et al., 1992, (89)].

[0161] Splice variants derived from the same genomic region, encoded by the same pre-mRNA, can be identified by
the hybridization conditions described above for homology search. The specific characteristics of variant proteins encoded
by splice variants of the same pre-transcript may differ and can also be assayed as disclosed. A "BREAST CANCER
GENE" polynucleotide having a nucleotide sequence of one of the sequences of the SEQ ID NO: 1 to 26 or 53 to 75
or the complement thereof may therefore differ in parts of the entire sequence as presented for SEQ ID NO: 60 and the
encoded splice variants SEQ ID NO: 61 to 66. These refer to individual proteins SEQ ID NO: 83 to 89. The prediction
of splicing events and the identification of the utilized acceptor and donor sites within the pre-mRNA can be computed
(e.g. Software Package GRAIL or GenomeSCAN) and verified by PCR methods by those with skill in the art.

Antisense oligonucleotides 


[0162] Antisense oligonucleotides are nucleotide sequences which are complementary to a specific DNA or RNA
sequence. Once introduced into a cell, the complementary nucleotides combine with natural sequences produced by
the cell to form complexes and block either transcription or translation. Preferably, an antisense oligonucleotide is at
least 6 nucleotides in length, but can be at least 7, 8, 10, 12, 15, 20, 25, 30, 35, 40, 45, or 50 or more nucleotides long.
Longer sequences also can be used. Antisense oligonucleotide molecules can be provided in a DNA construct and
introduced into a cell as described above to decrease the level of "BREAST CANCER GENE" gene products in the cell.
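Because an antisense oligonucleotide is the reverse complement of its target, a candidate sequence can be derived directly from a stretch of the target transcript. The following is a minimal sketch under stated assumptions: the function name and the example transcript are hypothetical, the output is written as RNA, and the minimum length of 6 nucleotides follows the text.

```python
# Translation table for RNA base pairing (A-U, G-C).
_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def antisense_rna(target_mrna: str, start: int, length: int) -> str:
    """Return the antisense (reverse complement) RNA for a target segment.

    start is a 0-based offset into the target; length must be at least 6.
    """
    if length < 6:
        raise ValueError("antisense oligonucleotide should be at least 6 nt")
    rna = target_mrna.upper().replace("T", "U")
    segment = rna[start:start + length]
    if len(segment) < length:
        raise ValueError("segment runs past the end of the target")
    # Complement each base, then reverse to read 5' to 3'.
    return segment.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    print(antisense_rna("AUGGCCAAGUUC", start=0, length=6))
```

The example call targets the first six nucleotides of a made-up transcript and prints the sequence that would pair with them.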
[0163] Antisense oligonucleotides can be deoxyribonucleotides, ribonucleotides, peptide nucleic acids (PNAs;
described in U.S. Pat. No. 5,714,331), locked nucleic acids (LNAs; described in WO 99/12826), or a combination of them.
Oligonucleotides can be synthesized manually or by an automated synthesizer, by covalently linking the 5' end of one
nucleotide with the 3' end of another nucleotide with non-phosphodiester internucleotide linkages such as
alkylphosphonates, phosphorothioates, phosphorodithioates, alkylphosphonothioates, phosphoramidates,
phosphate esters, carbamates, acetamidate, carboxymethyl esters, carbonates, and phosphate triesters [Brown, 1994,
(126); Sonveaux, 1994, (127) and Uhlmann et al., 1990, (128)].

[0164] Modifications of "BREAST CANCER GENE" expression can be obtained by designing antisense
oligonucleotides which will form duplexes to the control, 5', or regulatory regions of the "BREAST CANCER GENE".
Oligonucleotides derived from the transcription initiation site, e.g., between positions -10 and +10 from the start site, are preferred.
Similarly, inhibition can be achieved using "triple helix" base-pairing methodology. Triple helix pairing is useful because
it causes inhibition of the ability of the double helix to open sufficiently for the binding of polymerases, transcription
factors, or chaperones. Therapeutic advances using triplex DNA have been described in the literature [Gee et al., 1994,
(129)]. An antisense oligonucleotide also can be designed to block translation of mRNA by preventing the transcript
from binding to ribosomes.

[0165] Precise complementarity is not required for successful complex formation between an antisense
oligonucleotide and the complementary sequence of a "BREAST CANCER GENE" polynucleotide. Antisense oligonucleotides
which comprise, for example, 2, 3, 4, or 5 or more stretches of contiguous nucleotides which are precisely
complementary to a "BREAST CANCER GENE" polynucleotide, each separated by a stretch of contiguous nucleotides which
are not complementary to adjacent "BREAST CANCER GENE" nucleotides, can provide sufficient targeting specificity
for "BREAST CANCER GENE" mRNA. Preferably, each stretch of complementary contiguous nucleotides is at least
4, 5, 6, 7, or 8 or more nucleotides in length. Non-complementary intervening sequences are preferably 1, 2, 3, or 4
nucleotides in length. One skilled in the art can easily use the calculated melting point of an antisense-sense pair to
determine the degree of mismatching which will be tolerated between a particular antisense oligonucleotide and a
particular "BREAST CANCER GENE" polynucleotide sequence.

[0166] Antisense oligonucleotides can be modified without affecting their ability to hybridize to a "BREAST CANCER
GENE" polynucleotide. These modifications can be internal or at one or both ends of the antisense molecule. For
example, inter-nucleoside phosphate linkages can be modified by adding cholesteryl or diamine moieties with varying
numbers of carbon residues between the amino groups and terminal ribose. Modified bases and/or sugars, such as
arabinose instead of ribose, or a 3', 5' substituted oligonucleotide in which the 3' hydroxyl group or the 5' phosphate
group are substituted, also can be employed in a modified antisense oligonucleotide. These modified oligonucleotides
can be prepared by methods well known in the art [Agrawal et al., 1992, (130); Uhlmann et al., 1987, (131) and
Uhlmann et al., (128)].


Ribozymes 

[0167] Ribozymes are RNA molecules with catalytic activity [Cech, 1987, (132); Cech, 1990, (133) and Couture &
Stinchcomb, 1996, (134)]. Ribozymes can be used to inhibit gene function by cleaving an RNA sequence, as is known
in the art (e.g., Haseloff et al., U.S. Patent 5,641,673). The mechanism of ribozyme action involves sequence-specific
hybridization of the ribozyme molecule to complementary target RNA, followed by endonucleolytic cleavage. Examples
include engineered hammerhead motif ribozyme molecules that can specifically and efficiently catalyze endonucleolytic
cleavage of specific nucleotide sequences.
[0168] The transcribed sequence of a "BREAST CANCER GENE" can be used to generate ribozymes which will
specifically bind to mRNA transcribed from a "BREAST CANCER GENE" genomic locus. Methods of designing and
constructing ribozymes which can cleave other RNA molecules in trans in a highly sequence specific manner have
been developed and described in the art [Haseloff et al., 1988, (135)]. For example, the cleavage activity of ribozymes
can be targeted to specific RNAs by engineering a discrete "hybridization" region into the ribozyme. The hybridization
region contains a sequence complementary to the target RNA and thus specifically hybridizes with the target [see, for
example, Gerlach et al., EP 0 321 201].

[0169] Specific ribozyme cleavage sites within a "BREAST CANCER GENE" RNA target can be identified by
scanning the target molecule for ribozyme cleavage sites which include the following sequences: GUA, GUU, and GUC.
Once identified, short RNA sequences of between 15 and 20 ribonucleotides corresponding to the region of the target
RNA containing the cleavage site can be evaluated for secondary structural features which may render the target
inoperable. Suitability of candidate "BREAST CANCER GENE" RNA targets also can be evaluated by testing
accessibility to hybridization with complementary oligonucleotides using ribonuclease protection assays. Longer
complementary sequences can be used to increase the affinity of the hybridization sequence for the target. The hybridizing
and cleavage regions of the ribozyme can be integrally related such that upon hybridizing to the target RNA through
the complementary regions, the catalytic region of the ribozyme can cleave the target.
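The scan for candidate cleavage sites described above lends itself to a short illustration. The sketch below is an assumption-laden example rather than a validated design tool: the function name and the example sequence are hypothetical, and secondary structure evaluation is left to separate software.

```python
import re


def find_cleavage_sites(target_rna: str, window: int = 20) -> list:
    """List candidate ribozyme cleavage sites (GUA, GUU, GUC) in a target RNA.

    Returns tuples of (0-based position, triplet, surrounding region). The
    region is up to `window` nucleotides long and may be shorter near the
    ends of the target.
    """
    rna = target_rna.upper().replace("T", "U")
    half = (window - 3) // 2
    sites = []
    # The lookahead makes overlapping candidate sites visible as well.
    for match in re.finditer(r"(?=GU[ACU])", rna):
        pos = match.start()
        left = max(0, pos - half)
        right = min(len(rna), left + window)
        sites.append((pos, rna[pos:pos + 3], rna[left:right]))
    return sites


if __name__ == "__main__":
    for pos, triplet, region in find_cleavage_sites("AAAGUCAAAGUGAAAGUUAAA"):
        print(pos, triplet, region)
```

Each reported region could then be passed to a structure prediction program to judge whether the site is likely to be accessible.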

[0170] Ribozymes can be introduced into cells as part of a DNA construct. Mechanical methods, such as
microinjection, liposome-mediated transfection, electroporation, or calcium phosphate precipitation, can be used to introduce
a ribozyme-containing DNA construct into cells in which it is desired to decrease "BREAST CANCER GENE"
expression. Alternatively, if it is desired that the cells stably retain the DNA construct, the construct can be supplied on a
plasmid and maintained as a separate element or integrated into the genome of the cells, as is known in the art. A
ribozyme-encoding DNA construct can include transcriptional regulatory elements, such as a promoter element, an
enhancer or UAS element, and a transcriptional terminator signal, for controlling transcription of ribozymes in the cells.
[0171] As taught in Haseloff et al., U.S. Pat. No. 5,641,673, ribozymes can be engineered so that ribozyme expression
will occur in response to factors which induce expression of a target gene. Ribozymes also can be engineered to
provide an additional level of regulation, so that destruction of mRNA occurs only when both a ribozyme and a target
gene are induced in the cells.

Polypeptides 

[0172] "BREAST CANCER GENE" polypeptides according to the invention comprise a polypeptide selected from
SEQ ID NO: 27 to 52 and 76 to 98 or encoded by any of the polynucleotide sequences of the SEQ ID NO: 1 to 26 and
53 to 75 or derivatives, fragments, analogues and homologues thereof. A "BREAST CANCER GENE" polypeptide of
the invention therefore can be a portion, a full-length, or a fusion protein comprising all or a portion of a "BREAST
CANCER GENE" polypeptide.

Protein Purification 

[0173] "BREAST CANCER GENE" polypeptides can be purified from any cell which expresses the polypeptide, including
host cells which have been transfected with "BREAST CANCER GENE" expression constructs. Breast tissue is an
especially useful source of "BREAST CANCER GENE" polypeptides. A purified "BREAST CANCER GENE"
polypeptide is separated from other compounds which normally associate with the "BREAST CANCER GENE" polypeptide in
the cell, such as certain proteins, carbohydrates, or lipids, using methods well-known in the art. Such methods include,
but are not limited to, size exclusion chromatography, ammonium sulfate fractionation, ion exchange chromatography,
affinity chromatography, and preparative gel electrophoresis. A preparation of purified "BREAST CANCER GENE"
polypeptides is at least 80% pure; preferably, the preparations are 90%, 95%, or 99% pure. Purity of the preparations
can be assessed by any means known in the art, such as SDS-polyacrylamide gel electrophoresis.

Obtaining Polypeptides 


[0174] "BREAST CANCER GENE" polypeptides can be obtained, for example, by purification from human cells, by 
expression of "BREAST CANCER GENE" polynucleotides, or by direct chemical synthesis. 

Biologically Active Variants 


[0175] "BREAST CANCER GENE" polypeptide variants which are biologically active, i.e., retain a "BREAST
CANCER GENE" activity, also are "BREAST CANCER GENE" polypeptides. Preferably, naturally or non-naturally occurring
"BREAST CANCER GENE" polypeptide variants have amino acid sequences which are at least about 60, 65, or 70,
preferably about 75, 80, 85, 90, 92, 94, 96, or 98% identical to any of the amino acid sequences of the polypeptides
of SEQ ID NO: 27 to 52 or 76 to 98 or the polypeptides encoded by any of the polynucleotides of SEQ ID NO: 1 to 26
or 53 to 75 or a fragment thereof. Percent identity between a putative "BREAST CANCER GENE" polypeptide variant
and any of the polypeptides of SEQ ID NO: 27 to 52 or 76 to 98 or the polypeptides encoded by any of the polynucleotides
of SEQ ID NO: 1 to 26 or 53 to 75 or a fragment thereof is determined by conventional methods [see, for example,
Altschul et al., 1986, (90) and Henikoff & Henikoff, 1992, (91)]. Briefly, two amino acid sequences are aligned to optimize
the alignment scores using a gap opening penalty of 10, a gap extension penalty of 1, and the "BLOSUM62" scoring
matrix of Henikoff & Henikoff, (91).
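Once two sequences have been aligned, percent identity is a simple count over the alignment columns. The sketch below assumes the alignment has already been produced by an external tool and that identity is expressed relative to the full alignment length, gaps included; other denominators are in use, so this convention and the function name are assumptions.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    Columns in which both sequences carry a gap are ignored; all other
    columns, including single-gap columns, count toward the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("no alignable columns")
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    # Hypothetical six-column alignment with one gap and one substitution.
    print(round(percent_identity("MKT-LV", "MKTALI"), 1))
```

In the example, four of six columns are identical, so the reported identity is about 66.7%.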

[0176] Those skilled in the art appreciate that there are many established algorithms available to align two amino
acid sequences. The "FASTA" similarity search algorithm of Pearson & Lipman is a suitable protein alignment method
for examining the level of identity shared by an amino acid sequence disclosed herein and the amino acid sequence
of a putative variant [Pearson & Lipman, 1988, (92), and Pearson, 1990, (93)]. Briefly, FASTA first characterizes
sequence similarity by identifying regions shared by the query sequence (e.g., SEQ ID NO: 1 to 26 or 53 to 75) and a
test sequence that have either the highest density of identities (if the ktup variable is 1) or pairs of identities (if ktup=2),
without considering conservative amino acid substitutions, insertions, or deletions. The ten regions with the highest
density of identities are then rescored by comparing the similarity of all paired amino acids using an amino acid
substitution matrix, and the ends of the regions are "trimmed" to include only those residues that contribute to the highest
score. If there are several regions with scores greater than the "cutoff" value (calculated by a predetermined formula
based upon the length of the sequence and the ktup value), then the trimmed initial regions are examined to determine
whether the regions can be joined to form an approximate alignment with gaps. Finally, the highest scoring regions of
the two amino acid sequences are aligned using a modification of the Needleman-Wunsch-Sellers algorithm
[Needleman & Wunsch, 1970, (94), and Sellers, 1974, (95)], which allows for amino acid insertions and deletions. Preferred
parameters for FASTA analysis are: ktup=1, gap opening penalty=10, gap extension penalty=1, and substitution
matrix=BLOSUM62. These parameters can be introduced into a FASTA program by modifying the scoring matrix file
("SMATRIX"), as explained in Appendix 2 of Pearson, (93).
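The first FASTA stage described above, locating regions rich in shared identities, can be sketched by counting k-tuple matches per alignment diagonal. This is a simplified illustration and not the FASTA program itself: rescoring, trimming, joining and the final alignment are omitted, and the function name and example sequences are hypothetical.

```python
from collections import Counter, defaultdict


def ktup_diagonal_counts(query: str, test: str, ktup: int = 2) -> Counter:
    """Count shared k-tuples per diagonal (query offset minus test offset).

    Diagonals with many hits mark regions with a high density of identities.
    """
    # Index every k-tuple of the query by its start position.
    index = defaultdict(list)
    for i in range(len(query) - ktup + 1):
        index[query[i:i + ktup]].append(i)

    diagonals = Counter()
    for j in range(len(test) - ktup + 1):
        for i in index.get(test[j:j + ktup], ()):
            diagonals[i - j] += 1
    return diagonals


if __name__ == "__main__":
    counts = ktup_diagonal_counts("MKTAYIAK", "GGMKTAYI", ktup=2)
    print(counts.most_common(3))
```

In the example the test sequence contains the query shifted by two residues, so the hits concentrate on a single diagonal.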

[0177] FASTA can also be used to determine the sequence identity of nucleic acid molecules using a ratio as disclosed
above. For nucleotide sequence comparisons, the ktup value can range from one to six, preferably from three to
six, most preferably three, with other parameters set as default.

[0178] Variations in percent identity can be due, for example, to amino acid substitutions, insertions, or deletions.
Amino acid substitutions are defined as one-for-one amino acid replacements. They are conservative in nature when
the substituted amino acid has similar structural and/or chemical properties. Examples of conservative replacements
are substitution of a leucine with an isoleucine or valine, an aspartate with a glutamate, or a threonine with a serine.
[0179] Amino acid insertions or deletions are changes to or within an amino acid sequence. They typically fall in the
range of about 1 to 5 amino acids. Guidance in determining which amino acid residues can be substituted, inserted,
or deleted without abolishing biological or immunological activity of a "BREAST CANCER GENE" polypeptide can be
found using computer programs well known in the art, such as DNASTAR software. Whether an amino acid change
results in a biologically active "BREAST CANCER GENE" polypeptide can readily be determined by assaying for
"BREAST CANCER GENE" activity, as described for example, in the specific Examples, below. Larger insertions or
deletions can also be caused by alternative splicing. Protein domains can be inserted or deleted without altering the
main activity of the protein.

Fusion Proteins 



[0180] Fusion proteins are useful for generating antibodies against "BREAST CANCER GENE" polypeptide amino
acid sequences and for use in various assay systems. For example, fusion proteins can be used to identify proteins
which interact with portions of a "BREAST CANCER GENE" polypeptide. Protein affinity chromatography or
library-based assays for protein-protein interactions, such as the yeast two-hybrid or phage display systems, can be used for
this purpose. Such methods are well known in the art and also can be used as drug screens.
[0181] A "BREAST CANCER GENE" polypeptide fusion protein comprises two polypeptide segments fused together
by means of a peptide bond. The first polypeptide segment comprises at least 25, 50, 75, 100, 150, 200, 300, 400,
500, 600, 700 or 750 contiguous amino acids of an amino acid sequence encoded by any polynucleotide sequences
of the SEQ ID NO: 1 to 26 or 53 to 75 or of a biologically active variant, such as those described above. The first
polypeptide segment also can comprise full-length "BREAST CANCER GENE".

[0182] The second polypeptide segment can be a full-length protein or a protein fragment. Proteins commonly used
in fusion protein construction include β-galactosidase, β-glucuronidase, green fluorescent protein (GFP),
autofluorescent proteins, including blue fluorescent protein (BFP), glutathione-S-transferase (GST), luciferase, horseradish
peroxidase (HRP), and chloramphenicol acetyltransferase (CAT). Additionally, epitope tags are used in fusion protein
constructions, including histidine (His) tags, FLAG tags, influenza hemagglutinin (HA) tags, Myc tags, VSV-G tags,
and thioredoxin (Trx) tags. Other fusion constructions can include maltose binding protein (MBP), S-tag, LexA DNA
binding domain (DBD) fusions, GAL4 DNA binding domain fusions, and herpes simplex virus (HSV) VP16 protein
fusions. A fusion protein also can be engineered to contain a cleavage site located between the "BREAST CANCER
GENE" polypeptide-encoding sequence and the heterologous protein sequence, so that the "BREAST CANCER
GENE" polypeptide can be cleaved and purified away from the heterologous moiety.

[0183] A fusion protein can be synthesized chemically, as is known in the art. Preferably, a fusion protein is produced
by covalently linking two polypeptide segments or by standard procedures in the art of molecular biology. Recombinant
DNA methods can be used to prepare fusion proteins, for example, by making a DNA construct which comprises coding
sequences selected from any of the polynucleotide sequences of the SEQ ID NO: 1 to 26 and 53 to 75 in proper reading
frame with nucleotides encoding the second polypeptide segment and expressing the DNA construct in a host cell, as
is known in the art. Many kits for constructing fusion proteins are available from companies such as Promega
Corporation (Madison, WI), Stratagene (La Jolla, CA), CLONTECH (Mountain View, CA), Santa Cruz Biotechnology (Santa
Cruz, CA), MBL International Corporation (MIC; Watertown, MA), and Quantum Biotechnologies (Montreal, Canada;
1-888-DNA-KITS).


Identification of Species Homologues

[0184] Species homologues of a human "BREAST CANCER GENE" polypeptide can be obtained using "BREAST
CANCER GENE" polynucleotides (described below) to make suitable probes or primers for screening
cDNA expression libraries from other species, such as mice, monkeys, or yeast, identifying cDNAs which encode
homologues of a "BREAST CANCER GENE" polypeptide, and expressing the cDNAs as is known in the art.

Expression of Polynucleotides 

[0185] To express a "BREAST CANCER GENE" polynucleotide, the polynucleotide can be inserted into an
expression vector which contains the necessary elements for the transcription and translation of the inserted coding sequence.
Methods which are well known to those skilled in the art can be used to construct expression vectors containing
sequences encoding "BREAST CANCER GENE" polypeptides and appropriate transcriptional and translational control
elements. These methods include in vitro recombinant DNA techniques, synthetic techniques, and in vivo genetic
recombination. Such techniques are described, for example, in Sambrook et al., (77) and in Ausubel et al., (78).

[0186] A variety of expression vector/host systems can be utilized to contain and express sequences encoding a
"BREAST CANCER GENE" polypeptide. These include, but are not limited to, microorganisms, such as bacteria
transformed with recombinant bacteriophage, plasmid, or cosmid DNA expression vectors; yeast transformed with yeast
expression vectors; insect cell systems infected with virus expression vectors (e.g., baculovirus); plant cell systems
transformed with virus expression vectors (e.g., cauliflower mosaic virus, CaMV; tobacco mosaic virus, TMV) or with
bacterial expression vectors (e.g., Ti or pBR322 plasmids); or animal cell systems.

[0187] The control elements or regulatory sequences are those regions of the vector (enhancers, promoters, 5' and
3' untranslated regions) which interact with host cellular proteins to carry out transcription and translation. Such elements
can vary in their strength and specificity. Depending on the vector system and host utilized, any number of suitable
transcription and translation elements, including constitutive and inducible promoters, can be used. For example, when
cloning in bacterial systems, inducible promoters such as the hybrid lacZ promoter of the BLUESCRIPT phagemid
(Stratagene, La Jolla, Calif.) or pSPORT1 plasmid (Life Technologies) and the like can be used. The baculovirus
polyhedrin promoter can be used in insect cells. Promoters or enhancers derived from the genomes of plant cells (e.g.,
heat shock, RUBISCO, and storage protein genes) or from plant viruses (e.g., viral promoters or leader sequences)
can be cloned into the vector. In mammalian cell systems, promoters from mammalian genes or from mammalian
viruses are preferable. If it is necessary to generate a cell line that contains multiple copies of a nucleotide sequence
encoding a "BREAST CANCER GENE" polypeptide, vectors based on SV40 or EBV can be used with an appropriate
selectable marker.

Bacterial and Yeast Expression Systems

[0188] In bacterial systems, a number of expression vectors can be selected depending upon the use intended for
the "BREAST CANCER GENE" polypeptide. For example, when a large quantity of the "BREAST CANCER GENE"
polypeptide is needed for the induction of antibodies, vectors which direct high level expression of fusion proteins that
are readily purified can be used. Such vectors include, but are not limited to, multifunctional E. coli cloning and
expression vectors such as BLUESCRIPT (Stratagene). In a BLUESCRIPT vector, a sequence encoding the "BREAST
CANCER GENE" polypeptide can be ligated into the vector in frame with sequences for the amino terminal Met and the
subsequent 7 residues of β-galactosidase so that a hybrid protein is produced. pIN vectors [Van Heeke & Schuster,
(17)] or pGEX vectors (Promega, Madison, Wis.) also can be used to express foreign polypeptides as fusion proteins
with glutathione S-transferase (GST).

[0189] In general, such fusion proteins are soluble and can easily be purified from lysed cells by adsorption to
glutathione agarose beads followed by elution in the presence of free glutathione. Proteins made in such systems can be
designed to include heparin, thrombin, or factor Xa protease cleavage sites so that the cloned polypeptide of interest
can be released from the GST moiety at will.

[0190] In the yeast Saccharomyces cerevisiae, a number of vectors containing constitutive or inducible promoters
such as alpha factor, alcohol oxidase, and PGH can be used. For reviews, see Ausubel et al., (4) and Grant et al., (18).

Plant and Insect Expression Systems

[0191] If plant expression vectors are used, the expression of sequences encoding "BREAST CANCER GENE"
polypeptides can be driven by any of a number of promoters. For example, viral promoters such as the 35S and 19S
promoters of CaMV can be used alone or in combination with the omega leader sequence from TMV [Takamatsu,
1987, (96)]. Alternatively, plant promoters such as the small subunit of RUBISCO or heat shock promoters can be used
[Coruzzi et al., 1984, (97); Broglie et al., 1984, (98); Winter et al., 1991, (99)]. These constructs can be introduced into
plant cells by direct DNA transformation or by pathogen-mediated transfection. Such techniques are described in a
number of generally available reviews.

[0192] An insect system also can be used to express a "BREAST CANCER GENE" polypeptide. For example, in
one such system Autographa californica nuclear polyhedrosis virus (AcNPV) is used as a vector to express foreign
genes in Spodoptera frugiperda cells or in Trichoplusia larvae. Sequences encoding "BREAST CANCER GENE"
polypeptides can be cloned into a nonessential region of the virus, such as the polyhedrin gene, and placed under
control of the polyhedrin promoter. Successful insertion of sequences encoding "BREAST CANCER GENE" polypeptides
will render the polyhedrin gene inactive and produce recombinant virus lacking coat protein. The recombinant viruses
can then be used to infect S. frugiperda cells or Trichoplusia larvae in which "BREAST CANCER GENE" polypeptides can be
expressed [Engelhard et al., 1994, (100)].

Mammalian Expression Systems 

[0193] A number of viral-based expression systems can be used to express "BREAST CANCER GENE" polypeptides
in mammalian host cells. For example, if an adenovirus is used as an expression vector, sequences encoding "BREAST
CANCER GENE" polypeptides can be ligated into an adenovirus transcription/translation complex comprising the late
promoter and tripartite leader sequence. Insertion in a nonessential E1 or E3 region of the viral genome can be used
to obtain a viable virus which is capable of expressing a "BREAST CANCER GENE" polypeptide in infected host cells
[Logan & Shenk, 1984, (101)]. If desired, transcription enhancers, such as the Rous sarcoma virus (RSV) enhancer,
can be used to increase expression in mammalian host cells.

[0194] Human artificial chromosomes (HACs) also can be used to deliver larger fragments of DNA than can be 
contained and expressed in a plasmid. HACs of 6M to 10M are constructed and delivered to cells via conventional 
delivery methods (e.g., liposomes, polycationic amino polymers, or vesicles). 

40 [0195] Specific initiation signals also can be used to achieve more efficient translation of sequences encoding 
"BREAST CANCER GENE" polypeptides. Such signals include the ATG initiation codon and adjacent sequences. In 
cases where sequences encoding a "BREAST CANCER GENE" polypeptide, its initiation codon, and upstream
sequences are inserted into the appropriate expression vector, no additional transcriptional or translational control signals
may be needed. However, in cases where only coding sequence, or a fragment thereof, is inserted, exogenous
translational control signals (including the ATG initiation codon) should be provided. The initiation codon should be in the
correct reading frame to ensure translation of the entire insert. Exogenous translational elements and initiation codons 
can be of various origins, both natural and synthetic. The efficiency of expression can be enhanced by the inclusion 
of enhancers which are appropriate for the particular cell system which is used [Scharf et al., 1994, (102)]. 
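
By way of non-limiting illustration, the requirement that the initiation codon be in the correct reading frame can be checked computationally before an insert is cloned. The following Python sketch assumes that the insert is supplied as a plain string of the coding strand; the function name check_open_reading_frame and the example sequences are hypothetical and are not part of the disclosure.

```python
from typing import List

STOP_CODONS = {"TAA", "TAG", "TGA"}


def check_open_reading_frame(insert: str) -> List[str]:
    """Return a list of problems that would prevent translation of the entire insert."""
    seq = insert.upper()
    problems = []
    # The ATG initiation codon must open the reading frame.
    if not seq.startswith("ATG"):
        problems.append("no ATG initiation codon at position 1")
    # A complete frame consists of whole codons only.
    if len(seq) % 3 != 0:
        problems.append("length is not a multiple of 3")
    # Split into complete codons and look for a stop codon before the last one.
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    for index, codon in enumerate(codons[:-1]):
        if codon in STOP_CODONS:
            problems.append(f"premature stop codon {codon} at codon {index + 1}")
    return problems


# Hypothetical example inserts.
in_frame = check_open_reading_frame("ATGGCCTAA")
interrupted = check_open_reading_frame("ATGTAGGCCTAA")
```

An empty list indicates that, under these simplified assumptions, the insert begins with ATG and can be read through to its final codon.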

Host Cells

[0196] A host cell strain can be chosen for its ability to modulate the expression of the inserted sequences or to 
process the expressed "BREAST CANCER GENE" polypeptide in the desired fashion. Such modifications of the 
polypeptide include, but are not limited to, acetylation, carboxylation, glycosylation, phosphorylation, lipidation, and 
acylation. Post-translational processing which cleaves a "prepro" form of the polypeptide also can be used to facilitate
correct insertion, folding and/or function. Different host cells which have specific cellular machinery and characteristic
mechanisms for post-translational activities (e.g., CHO, HeLa, MDCK, HEK293, and WI38) are available from the
American Type Culture Collection (ATCC; 10801 University Boulevard, Manassas, VA 20110-2209) and can be chosen
to ensure the correct modification and processing of the foreign protein.

[0197] Stable expression is preferred for long-term, high-yield production of recombinant proteins. For example, cell 
lines which stably express "BREAST CANCER GENE" polypeptides can be transformed using expression vectors 
which can contain viral origins of replication and/or endogenous expression elements and a selectable marker gene
on the same or on a separate vector. Following the introduction of the vector, cells can be allowed to grow for 1-2 days
in an enriched medium before they are switched to a selective medium. The purpose of the selectable marker is to 
confer resistance to selection, and its presence allows growth and recovery of cells which successfully express the 
introduced "BREAST CANCER GENE" sequences. Resistant clones of stably transformed cells can be proliferated 
using tissue culture techniques appropriate to the cell type [Freshney et al., 1986, (103)].

[0198] Any number of selection systems can be used to recover transformed cell lines. These include, but are not
limited to, the herpes simplex virus thymidine kinase [Wigler et al., 1977, (104)] and adenine phosphoribosyltransferase
[Lowy et al., 1980, (105)] genes which can be employed in tk- or aprt- cells, respectively. Also, antimetabolite, antibiotic,
or herbicide resistance can be used as the basis for selection. For example, dhfr confers resistance to methotrexate
[Wigler et al., 1980, (106)], npt confers resistance to the aminoglycosides neomycin and G418 [Colbere-Garapin et
al., 1981, (107)], and als and pat confer resistance to chlorsulfuron and phosphinothricin, respectively.
Additional selectable genes have been described. For example, trpB allows cells to utilize indole in place of tryptophan,
and hisD allows cells to utilize histidinol in place of histidine [Hartman & Mulligan, 1988, (108)]. Visible markers
such as anthocyanins, β-glucuronidase and its substrate GUS, and luciferase and its substrate luciferin, can be used
to identify transformants and to quantify the amount of transient or stable protein expression attributable to a specific
vector system [Rhodes et al., 1995, (109)].

Detecting Expression and gene product 

[0199] Although the presence of marker gene expression suggests that the "BREAST CANCER GENE" polynucleotide
is also present, its presence and expression may need to be confirmed. For example, if a sequence encoding a
"BREAST CANCER GENE" polypeptide is inserted within a marker gene sequence, transformed cells containing
sequences which encode a "BREAST CANCER GENE" polypeptide can be identified by the absence of marker gene
function. Alternatively, a marker gene can be placed in tandem with a sequence encoding a "BREAST CANCER GENE"
polypeptide under the control of a single promoter. Expression of the marker gene in response to induction or selection
usually indicates expression of the "BREAST CANCER GENE" polynucleotide.

[0200] Alternatively, host cells which contain a "BREAST CANCER GENE" polynucleotide and which express a 
"BREAST CANCER GENE" polypeptide can be identified by a variety of procedures known to those of skill in the art. 
These procedures include, but are not limited to, DNA-DNA or DNA-RNA hybridization and protein bioassay or
immunoassay techniques which include membrane, solution, or chip-based technologies for the detection and/or
quantification of polynucleotide or protein. For example, the presence of a polynucleotide sequence encoding a "BREAST
CANCER GENE" polypeptide can be detected by DNA-DNA or DNA-RNA hybridization or amplification using probes
or fragments of polynucleotides encoding a "BREAST CANCER GENE" polypeptide. Nucleic acid
amplification-based assays involve the use of oligonucleotides selected from sequences encoding a "BREAST CANCER
GENE" polypeptide to detect transformants which contain a "BREAST CANCER GENE" polynucleotide.

[0201] A variety of protocols for detecting and measuring the expression of a "BREAST CANCER GENE" polypeptide,
using either polyclonal or monoclonal antibodies specific for the polypeptide, are known in the art. Examples include 
enzyme-linked immunosorbent assay (ELISA), radioimmunoassay (RIA), and fluorescence activated cell sorting 
(FACS). A two-site, monoclonal-based immunoassay using monoclonal antibodies reactive to two non-interfering 
epitopes on a "BREAST CANCER GENE" polypeptide can be used, or a competitive binding assay can be employed.
These and other assays are described in Hampton et al., (110) and Maddox et al., (111).

[0202] A wide variety of labels and conjugation techniques are known by those skilled in the art and can be used in 
various nucleic acid and amino acid assays. Means for producing labeled hybridization or PCR probes for detecting 
sequences related to polynucleotides encoding "BREAST CANCER GENE" polypeptides include oligo labeling, nick 
translation, end-labeling, or PCR amplification using a labeled nucleotide. Alternatively, sequences encoding a
"BREAST CANCER GENE" polypeptide can be cloned into a vector for the production of an mRNA probe. Such vectors
are known in the art, are commercially available, and can be used to synthesize RNA probes in vitro by addition of 
labeled nucleotides and an appropriate RNA polymerase such as T7, T3, or SP6. These procedures can be conducted 
using a variety of commercially available kits (Amersham Pharmacia Biotech, Promega, and US Biochemical). Suitable 
reporter molecules or labels which can be used for ease of detection include radionuclides, enzymes, and fluorescent,
chemiluminescent, or chromogenic agents, as well as substrates, cofactors, inhibitors, magnetic particles, and the like.

Expression and Purification of Polypeptides

[0203] Host cells transformed with nucleotide sequences encoding a "BREAST CANCER GENE" polypeptide can 
be cultured under conditions suitable for the expression and recovery of the protein from cell culture. The polypeptide
produced by a transformed cell can be secreted or stored intracellularly depending on the sequence and/or the vector
used. As will be understood by those of skill in the art, expression vectors containing polynucleotides which encode
"BREAST CANCER GENE" polypeptides can be designed to contain signal sequences which direct secretion of soluble
"BREAST CANCER GENE" polypeptides through a prokaryotic or eukaryotic cell membrane or which direct the
membrane insertion of membrane-bound "BREAST CANCER GENE" polypeptide.

[0204] As discussed above, other constructions can be used to join a sequence encoding a "BREAST CANCER
GENE" polypeptide to a nucleotide sequence encoding a polypeptide domain which will facilitate purification of soluble
proteins. Such purification facilitating domains include, but are not limited to, metal chelating peptides such as
histidine-tryptophan modules that allow purification on immobilized metals, protein A domains that allow purification on
immobilized immunoglobulin, and the domain utilized in the FLAGS extension/affinity purification system (Immunex Corp.,
Seattle, Wash.). Inclusion of cleavable linker sequences such as those specific for Factor Xa or enterokinase
(Invitrogen, San Diego, CA) between the purification domain and the "BREAST CANCER GENE" polypeptide also can be
used to facilitate purification. One such expression vector provides for expression of a fusion protein containing a
"BREAST CANCER GENE" polypeptide and 6 histidine residues preceding a thioredoxin or an enterokinase cleavage
site. The histidine residues facilitate purification by IMAC (immobilized metal ion affinity chromatography) [Porath et al.,
1992, (112)], while the enterokinase cleavage site provides a means for purifying the "BREAST CANCER GENE"
polypeptide from the fusion protein. Vectors which contain fusion proteins are disclosed in Kroll et al., (113).

Chemical Synthesis 

[0205] Sequences encoding a "BREAST CANCER GENE" polypeptide can be synthesized, in whole or in part, using
chemical methods well known in the art (see Caruthers et al., (114) and Horn et al., (115)). Alternatively, a "BREAST
CANCER GENE" polypeptide itself can be produced using chemical methods to synthesize its amino acid sequence,
such as by direct peptide synthesis using solid-phase techniques [Merrifield, 1963, (116) and Roberge et al., 1995,
(117)]. Protein synthesis can be performed using manual techniques or by automation. Automated synthesis can be
achieved, for example, using Applied Biosystems 431A Peptide Synthesizer (Perkin Elmer). Optionally, fragments of
"BREAST CANCER GENE" polypeptides can be separately synthesized and combined using chemical methods to 
produce a full-length molecule. 

[0206] The newly synthesized peptide can be substantially purified by preparative high performance liquid
chromatography [Creighton, 1983, (118)]. The composition of a synthetic "BREAST CANCER GENE" polypeptide can be
confirmed by amino acid analysis or sequencing (e.g., the Edman degradation procedure; see Creighton, (118)).
Additionally, any portion of the amino acid sequence of the "BREAST CANCER GENE" polypeptide can be altered during
direct synthesis and/or combined using chemical methods with sequences from other proteins to produce a variant 
polypeptide or a fusion protein. 

Production of Altered Polypeptides

[0207] As will be understood by those of skill in the art, it may be advantageous to produce "BREAST CANCER
GENE" polypeptide-encoding nucleotide sequences possessing non-naturally occurring codons. For example, codons
preferred by a particular prokaryotic or eukaryotic host can be selected to increase the rate of protein expression or to
produce an RNA transcript having desirable properties, such as a half-life which is longer than that of a transcript
generated from the naturally occurring sequence. 
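
Purely as an illustrative sketch, the selection of host-preferred codons can be pictured as a codon-by-codon substitution of synonymous codons. The Python example below assumes a small, hypothetical and deliberately incomplete substitution table (leucine codons mapped to CTG, arginine codons mapped to CGT); in practice the table would be derived from a codon usage table for the chosen host. The names PREFERRED_SYNONYMS and recode_for_host are likewise assumptions made for illustration.

```python
from typing import Dict

# Hypothetical, incomplete map from a codon to a synonymous codon that is
# ASSUMED to be preferred by the expression host (Leu -> CTG, Arg -> CGT).
PREFERRED_SYNONYMS: Dict[str, str] = {
    "TTA": "CTG",
    "CTA": "CTG",
    "AGA": "CGT",
    "AGG": "CGT",
}


def recode_for_host(coding_sequence: str,
                    preferred: Dict[str, str] = PREFERRED_SYNONYMS) -> str:
    """Replace each codon by its preferred synonym; other codons are kept unchanged."""
    seq = coding_sequence.upper()
    if len(seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    return "".join(preferred.get(codon, codon) for codon in codons)


# Hypothetical coding sequence: Met-Leu-Arg-stop.
original = "ATGTTAAGATAA"
recoded = recode_for_host(original)
```

Because only synonymous codons are exchanged, the encoded amino acid sequence is unchanged while the nucleotide sequence differs from the naturally occurring one.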

[0208] The nucleotide sequences disclosed herein can be engineered using methods generally known in the art to 
alter "BREAST CANCER GENE" polypeptide-encoding sequences for a variety of reasons, including but not limited 
to, alterations which modify the cloning, processing, and/or expression of the polypeptide or mRNA product. DNA 
shuffling by random fragmentation and PCR re-assembly of gene fragments and synthetic oligonucleotides can be
used to engineer the nucleotide sequences. For example, site-directed mutagenesis can be used to insert new
restriction sites, alter glycosylation patterns, change codon preference, produce splice variants, introduce mutations, and so
forth. 

Predictive, Diagnostic and Prognostic Assays

[0209] The present invention provides a method for determining whether a subject is at risk for developing malignant
neoplasia and breast cancer in particular by detecting one of the disclosed polynucleotide markers comprising any of
the polynucleotide sequences of the SEQ ID NO: 2 to 6, 8, 9, 11 to 16, 18, 19 or 21 to 26 or 53 to 75 and/or the
polypeptide markers encoded thereby or polypeptide markers comprising any of the polypeptide sequences of the
SEQ ID NO: 28 to 32, 34, 35, 37 to 42, 44, 45 or 47 to 52 or 76 to 98 or at least 2 of the disclosed polynucleotides
selected from SEQ ID NO: 1 to 26 and 53 to 75 or the at least 2 of the disclosed polypeptides selected from SEQ ID
NO: 28 to 32 and 76 to 98 for malignant neoplasia and breast cancer in particular.

[0210] In clinical applications, biological samples can be screened for the presence and/or absence of the biomarkers 
identified herein. Such samples are for example needle biopsy cores, surgical resection samples, or body fluids like 
serum, thin needle nipple aspirates and urine. For example, these methods include obtaining a biopsy, which is
optionally fractionated by cryostat sectioning to enrich diseased cells to about 80% of the total cell population. In certain
embodiments, polynucleotides extracted from these samples may be amplified using techniques well known in the art.
The expression levels of selected markers detected would be compared with statistically valid groups of diseased and 
healthy samples. 

[0211] In one embodiment the diagnostic method comprises determining whether a subject has an abnormal mRNA 
and/or protein level of the disclosed markers, such as by Northern blot analysis, reverse transcription-polymerase chain
reaction (RT-PCR), in situ hybridization, immunoprecipitation, Western blot hybridization, or immunohistochemistry.
According to the method, cells are obtained from a subject and the levels of the disclosed biomarkers, protein or mRNA
level, are determined and compared to the level of these markers in a healthy subject. An abnormal level of the biomarker
polypeptide or mRNA is likely to be indicative of malignant neoplasia such as breast cancer.
[0212] In another embodiment the diagnostic method comprises determining whether a subject has an abnormal
DNA content of said genes or said genomic loci, such as by Southern blot analysis, dot blot analysis, fluorescence or
colorimetric in situ hybridization, comparative genomic hybridization, genotyping by VNTR, STS-PCR or quantitative
PCR. In general these assays comprise the usage of probes from representative genomic regions. The probes contain
at least parts of said genomic regions or sequences complementary or analogous to said regions, in particular intra-
or intergenic regions of said genes or genomic regions. The probes can consist of nucleotide sequences or sequences
of analogous functions (e.g. PNAs, Morpholino oligomers) being able to bind to target regions by hybridization. In
general, genomic regions being altered in said patient samples are compared with unaffected control samples (normal
tissue from the same or different patients, surrounding unaffected tissue, peripheral blood) or with genomic regions of
the same sample that do not have said alterations and can therefore serve as internal controls. In a preferred embodiment
regions located on the same chromosome are used. Alternatively, gonosomal regions and/or regions with defined
varying amount in the sample are used. In one favored embodiment the DNA content, structure, composition or
modification of distinct genomic regions is compared. Especially favored are methods that detect the DNA
content of said samples, where the amount of target regions is altered by amplification and/or deletion. In another
embodiment the target regions are analyzed for the presence of polymorphisms (e.g. Single Nucleotide Polymorphisms
or mutations) that affect or predispose the cells in said samples with regard to clinical aspects, being of diagnostic,
prognostic or therapeutic value. Preferably, the identification of sequence variations is used to define haplotypes that
result in characteristic behavior of said samples with said clinical aspects.

[0213] The following examples of genes in 17q12-21.2 are offered by way of illustration, not by way of limitation.
[0214] One embodiment of the invention is a method for the prediction, diagnosis or prognosis of malignant neoplasia
by the detection of at least 10, at least 5, or at least 4, or at least 3 and more preferably at least 2 markers whereby the
markers are genes and fragments thereof and/or genomic nucleic acid sequences that are located on one chromosomal
region which is altered in malignant neoplasia.

[0215] One further embodiment of the invention is a method for the prediction, diagnosis or prognosis of malignant
neoplasia by the detection of at least 10, at least 5, or at least 4, or at least 3 and more preferably at least 2 markers
whereby the markers (a) are genes and fragments thereof and/or genomic nucleic acid sequences that are located on
one or more chromosomal region(s) which is/are altered in malignant neoplasia and (b) functionally interact as (i)
receptor and ligand or (ii) members of the same signal transduction pathway or (iii) members of synergistic signal
transduction pathways or (iv) members of antagonistic signal transduction pathways or (v) transcription factor and
transcription factor binding site.

[0216] In one embodiment, the method for the prediction, diagnosis or prognosis of malignant neoplasia and breast
cancer in particular is done by the detection of:

(a) a polynucleotide selected from the polynucleotides of the SEQ ID NO: 2 to 6, 8, 9, 11 to 16, 18, 19, 21 to 26 or
53 to 75;

(b) a polynucleotide which hybridizes under stringent conditions to a polynucleotide specified in (a) encoding a
polypeptide exhibiting the same biological function as specified for the respective sequence in Table 2 or 3;

(c) a polynucleotide the sequence of which deviates from the polynucleotide specified in (a) and (b) due to the
degeneracy of the genetic code encoding a polypeptide exhibiting the same biological function as specified for the
respective sequence in Table 2 or 3;

(d) a polynucleotide which represents a specific fragment, derivative or allelic variation of a polynucleotide
sequence specified in (a) to (c);

in a biological sample comprising the following steps: hybridizing any polynucleotide or analogous oligomer specified
in (a) to (d) to a polynucleotide material of a biological sample, thereby forming a hybridization complex; and detecting
said hybridization complex.

[0217] In another embodiment the method for the prediction, diagnosis or prognosis of malignant neoplasia is done
as just described, except that the polynucleotide material of the biological sample is amplified before hybridization.
[0218] In another embodiment the method for the diagnosis or prognosis of malignant neoplasia and breast cancer
in particular is done by the detection of:

(a) a polynucleotide selected from the polynucleotides of the SEQ ID NO: 2 to 6, 8, 9, 11 to 16, 18, 19, 21 to 26
or 53 to 75;

(b) a polynucleotide which hybridizes under stringent conditions to a polynucleotide specified in (a) encoding a
polypeptide exhibiting the same biological function as specified for the respective sequence in Table 2 or 3;

(c) a polynucleotide the sequence of which deviates from the polynucleotide specified in (a) and (b) due to the
degeneracy of the genetic code encoding a polypeptide exhibiting the same biological function as specified for the
respective sequence in Table 2 or 3;

(d) a polynucleotide which represents a specific fragment, derivative or allelic variation of a polynucleotide
sequence specified in (a) to (c);

(e) a polypeptide encoded by a polynucleotide sequence specified in (a) to (d);

(f) a polypeptide comprising any polypeptide of SEQ ID NO: 28 to 32, 34, 35, 37 to 42, 44, 45, 47 to 52 or 76 to 98;

comprising the steps of contacting a biological sample with a reagent which specifically interacts with the polynucleotide
specified in (a) to (d) or the polypeptide specified in (e).

DNA array technology

[0219] In one embodiment, the present invention also provides a method wherein polynucleotide probes are
immobilized on a DNA chip in an organized array. Oligonucleotides can be bound to a solid support by a variety of processes,
including lithography. For example, a chip can hold up to 410,000 oligonucleotides (GeneChip, Affymetrix). The present
invention provides significant advantages over the available tests for malignant neoplasia, such as breast cancer,
because it increases the reliability of the test by providing an array of polynucleotide markers on a single chip.

[0220] The method includes obtaining a biopsy of an affected person, which is optionally fractionated by cryostat
sectioning to enrich diseased cells to about 80% of the total cell population, and the use of body fluids such as serum
or urine, or cell-containing liquids (e.g. derived from fine needle aspirates). The DNA or RNA is then extracted,
amplified, and analyzed with a DNA chip to determine the presence or absence of the marker polynucleotide sequences.
In one embodiment, the polynucleotide probes are spotted onto a substrate in a two-dimensional matrix or array.
Samples of polynucleotides can be labeled and then hybridized to the probes. Double-stranded polynucleotides, comprising
the labeled sample polynucleotides bound to probe polynucleotides, can be detected once the unbound portion of the
sample is washed away.

[0221] The probe polynucleotides can be spotted on substrates including glass, nitrocellulose, etc. The probes can
be bound to the substrate by either covalent bonds or by non-specific interactions, such as hydrophobic interactions.
The sample polynucleotides can be labeled using radioactive labels, fluorophores, chromophores, etc. Techniques for
constructing arrays and methods of using these arrays are described in EP 0 799 897; WO 97/29212; WO 97/27317;
EP 0 785 280; WO 97/02357; U.S. Pat. No. 5,593,839; U.S. Pat. No. 5,578,832; EP 0 728 520; U.S. Pat. No. 5,599,695;
EP 0 721 016; U.S. Pat. No. 5,556,752; WO 95/22058; and U.S. Pat. No. 5,631,734.

[0222] Further, arrays can be used to examine differential expression of genes and can be used to determine gene
function. For example, arrays of the instant polynucleotide sequences can be used to determine if any of the
polynucleotide sequences are differentially expressed between normal cells and diseased cells. High expression
of a particular message in a diseased sample, which is not observed in a corresponding normal sample, can indicate
a breast cancer specific protein.

[0223] Accordingly, in one aspect, the invention provides probes and primers that are specific to the unique polynu- 
cleotide markers disclosed herein. 
[0224] In one embodiment, the method comprises using a polynucleotide probe to determine the presence of
malignant cells, and breast cancer cells in particular, in a tissue from a patient. Specifically, the method comprises:

1) providing a polynucleotide probe comprising a nucleotide sequence at least 12 nucleotides in length, preferably
at least 15 nucleotides, more preferably 25 nucleotides, and most preferably at least 40 nucleotides, and up to all
or nearly all of the coding sequence, which is complementary to a portion of the coding sequence of a polynucleotide
selected from the polynucleotides of SEQ ID NO: 1 to 26 and 53 to 75 or a sequence complementary thereto and is

2) differentially expressed in malignant neoplasia, such as breast cancer;

3) obtaining a tissue sample from a patient with malignant neoplasia;

4) providing a second tissue sample from a patient with no malignant neoplasia;

5) contacting the polynucleotide probe under stringent conditions with RNA of each of said first and second tissue
samples (e.g., in a Northern blot or in situ hybridization assay); and

6) comparing (a) the amount of hybridization of the probe with RNA of the first tissue sample, with (b) the amount
of hybridization of the probe with RNA of the second tissue sample;

wherein a statistically significant difference in the amount of hybridization with the RNA of the first tissue sample as
compared to the amount of hybridization with the RNA of the second tissue sample is indicative of malignant neoplasia
and breast cancer in particular in the first tissue sample.

Data analysis methods 

[0225] Comparison of the expression levels of one or more "BREAST CANCER GENES" with reference expression 
levels, e.g., expression levels in diseased cells of breast cancer or in normal counterpart cells, is preferably conducted 
using computer systems. In one embodiment, expression levels are obtained in two cells and these two sets of
expression levels are introduced into a computer system for comparison. In a preferred embodiment, one set of expression
levels is entered into a computer system for comparison with values that are already present in the computer system,
or in computer-readable form that is then entered into the computer system. 

[0226] In one embodiment, the invention provides a computer readable form of the gene expression profile data of 
the invention, or of values corresponding to the level of expression of at least one "BREAST CANCER GENE" in a 
diseased cell. The values can be mRNA expression levels obtained from experiments, e.g., microarray analysis. The 
values can also be mRNA levels normalized relative to a reference gene whose expression is constant in numerous
cells under numerous conditions, e.g., GAPDH. In other embodiments, the values in the computer are ratios of, or 
differences between, normalized or non-normalized mRNA levels in different samples. 
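
As a minimal sketch of the values described above, the following Python example normalizes raw mRNA levels to a reference gene and then forms ratios between two samples. The gene labels GENE_A and GENE_B, the signal intensities and the function names are hypothetical and serve only to illustrate the arithmetic; GAPDH is used as the reference gene as suggested in the text.

```python
from typing import Dict


def normalize_to_reference(levels: Dict[str, float],
                           reference_gene: str = "GAPDH") -> Dict[str, float]:
    """Divide each raw mRNA level by the level of the reference gene."""
    reference_level = levels[reference_gene]
    if reference_level <= 0:
        raise ValueError("reference gene level must be positive")
    return {gene: value / reference_level
            for gene, value in levels.items() if gene != reference_gene}


def ratios_between_samples(sample_a: Dict[str, float],
                           sample_b: Dict[str, float]) -> Dict[str, float]:
    """Ratio (a / b) of levels for genes measured, with a positive value, in both samples."""
    return {gene: sample_a[gene] / sample_b[gene]
            for gene in sample_a
            if gene in sample_b and sample_b[gene] > 0}


# Hypothetical raw signal intensities from two samples.
tumor_raw = {"GAPDH": 1000.0, "GENE_A": 4000.0, "GENE_B": 500.0}
normal_raw = {"GAPDH": 800.0, "GENE_A": 800.0, "GENE_B": 400.0}

tumor_norm = normalize_to_reference(tumor_raw)
normal_norm = normalize_to_reference(normal_raw)
tumor_vs_normal = ratios_between_samples(tumor_norm, normal_norm)
```

In this hypothetical data set the normalized level of GENE_A is four times higher in the tumor sample, whereas GENE_B is unchanged once the difference in the reference gene signal is taken into account.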

[0227] The gene expression profile data can be in the form of a table, such as an Excel table. The data can be alone, 
or it can be part of a larger database, e.g., comprising other expression profiles. For example, the expression profile 
data of the invention can be part of a public database. The computer readable form can be in a computer. In another
embodiment, the invention provides a computer displaying the gene expression profile data. 

[0228] In one embodiment, the invention provides a method for determining the similarity between the level of
expression of one or more "BREAST CANCER GENES" in a first cell, e.g., a cell of a subject, and that in a second cell,
comprising obtaining the level of expression of one or more "BREAST CANCER GENES" in a first cell and entering
these values into a computer comprising a database including records comprising values corresponding to levels of
expression of one or more "BREAST CANCER GENES" in a second cell, and processor instructions, e.g., a user 
interface, capable of receiving a selection of one or more values for comparison purposes with data that is stored in 
the computer. The computer may further comprise a means for converting the comparison data into a diagram or chart 
or other type of output. 

[0229] In another embodiment, values representing expression levels of "BREAST CANCER GENES" are entered
into a computer system, comprising one or more databases with reference expression levels obtained from more than 
one cell. For example, the computer comprises expression data of diseased and normal cells. Instructions are provided 
to the computer, and the computer is capable of comparing the data entered with the data in the computer to determine
whether the data entered is more similar to that of a normal cell or of a diseased cell.

[0230] In another embodiment, the computer comprises values of expression levels in cells of subjects at different 
stages of breast cancer, and the computer is capable of comparing expression data entered into the computer with 
the data stored, and of producing results indicating to which of the expression profiles in the computer the one entered is
most similar, such as to determine the stage of breast cancer in the subject.
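
The comparison of an entered profile with stored reference profiles can be sketched as follows. This Python example is a non-limiting illustration that assumes Euclidean distance as the similarity measure, which the text does not prescribe; the reference labels, gene names and expression values are hypothetical.

```python
import math
from typing import Dict

Profile = Dict[str, float]


def euclidean_distance(a: Profile, b: Profile) -> float:
    """Distance between two profiles over the genes measured in both."""
    shared = set(a) & set(b)
    if not shared:
        raise ValueError("profiles share no genes")
    return math.sqrt(sum((a[gene] - b[gene]) ** 2 for gene in shared))


def most_similar(query: Profile, references: Dict[str, Profile]) -> str:
    """Return the label of the stored reference profile closest to the query."""
    return min(references,
               key=lambda label: euclidean_distance(query, references[label]))


# Hypothetical reference profiles stored in the computer.
references = {
    "normal": {"GENE_A": 1.0, "GENE_B": 1.0, "GENE_C": 1.0},
    "stage_I": {"GENE_A": 2.0, "GENE_B": 1.5, "GENE_C": 1.0},
    "stage_III": {"GENE_A": 6.0, "GENE_B": 4.0, "GENE_C": 2.5},
}

# Hypothetical profile entered for a subject.
query = {"GENE_A": 5.5, "GENE_B": 3.5, "GENE_C": 2.0}
best_label = most_similar(query, references)
```

Other similarity measures, such as a correlation coefficient, could be substituted for the distance function without changing the overall scheme.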

[0231] In yet another embodiment, the reference expression profiles in the computer are expression profiles from 
cells of breast cancer of one or more subjects, which cells are treated in vivo or in vitro with a drug used for therapy 
of breast cancer. Upon entering of expression data of a cell of a subject treated in vitro or in vivo with the drug, the 
computer is instructed to compare the data entered to the data in the computer, and to provide results indicating whether
the expression data input into the computer are more similar to those of a cell of a subject that is responsive to the
drug or more similar to those of a cell of a subject that is not responsive to the drug. Thus, the results indicate whether 
the subject is likely to respond to the treatment with the drug or unlikely to respond to it. 

[0232] In one embodiment, the invention provides a system that comprises a means for receiving gene expression 
data for one or a plurality of genes; a means for comparing the gene expression data from each of said one or plurality 
of genes to a common reference frame; and a means for presenting the results of the comparison. This system may
further comprise a means for clustering the data. 

[0233] In another embodiment, the invention provides a computer program for analyzing gene expression data
comprising (i) a computer code that receives as input gene expression data for a plurality of genes and (ii) a computer code
that compares said gene expression data from each of said plurality of genes to a common reference frame. 

[0234] The invention also provides a machine-readable or computer-readable medium including program instructions
for performing the following steps: (i) comparing a plurality of values corresponding to expression levels of one or more 
genes characteristic of breast cancer in a query cell with a database including records comprising reference expression 
or expression profile data of one or more reference cells and an annotation of the type of cell; and (ii) indicating to 
which cell the query cell is most similar based on similarities of expression profiles. The reference cells can be cells
from subjects at different stages of breast cancer. The reference cells can also be cells from subjects responding or
not responding to a particular drug treatment and optionally incubated in vitro or in vivo with the drug. 
[0235] The reference cells may also be cells from subjects responding or not responding to several different treat- 
ments, and the computer system indicates a preferred treatment for the subject. Accordingly, the invention provides a 
method for selecting a therapy for a patient having breast cancer, the method comprising: (i) providing the level of
expression of one or more genes characteristic of breast cancer in a diseased cell of the patient; (ii) providing a plurality
of reference profiles, each associated with a therapy, wherein the subject expression profile and each reference profile 
has a plurality of values, each value representing the level of expression of a gene characteristic of breast cancer; and 
(iii) selecting the reference profile most similar to the subject expression profile, to thereby select a therapy for said 
patient. In a preferred embodiment step (iii) is performed by a computer. The most similar reference profile may be 

35 selected by weighing a comparison value of the plurality using a weight value associated with the corresponding ex- 
pression data. 
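
By way of illustration only, step (iii) might be sketched as a weighted nearest-profile search. The distance measure (weighted Euclidean), the gene names, the therapy labels and all numeric values below are assumptions chosen for the example; the specification does not prescribe a particular similarity measure.

```python
import math

def weighted_distance(subject, reference, weights):
    """Weighted Euclidean distance between two profiles (dicts: gene -> expression level).

    Genes without an explicit weight default to a weight of 1.0 (an assumption).
    """
    return math.sqrt(sum(
        weights.get(gene, 1.0) * (subject[gene] - reference[gene]) ** 2
        for gene in subject
    ))

def select_therapy(subject, reference_profiles, weights):
    """Return the therapy whose reference profile is closest to the subject profile."""
    return min(
        reference_profiles,
        key=lambda therapy: weighted_distance(subject, reference_profiles[therapy], weights),
    )

# Hypothetical expression values for three hypothetical marker genes.
subject = {"GENE_A": 8.0, "GENE_B": 2.0, "GENE_C": 5.0}
reference_profiles = {
    "therapy_1": {"GENE_A": 7.5, "GENE_B": 2.5, "GENE_C": 1.0},
    "therapy_2": {"GENE_A": 3.0, "GENE_B": 2.0, "GENE_C": 5.0},
}
weights = {"GENE_A": 2.0, "GENE_B": 1.0, "GENE_C": 0.5}

selected = select_therapy(subject, reference_profiles, weights)
print(selected)
```

In this sketch a larger weight makes disagreement on that gene count more heavily against a reference profile.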

[0236] The relative abundance of an mRNA in two biological samples can be scored as a perturbation and its magnitude
determined (i.e., the abundance is different in the two sources of mRNA tested), or as not perturbed (i.e., the
relative abundance is the same). In various embodiments, a difference between the two sources of RNA of at least a
factor of about 25% (RNA is 25% more abundant in one source than in the other source), more usually
about 50%, even more often by a factor of about 2 (twice as abundant), 3 (three times as abundant) or 5 (five times
as abundant) is scored as a perturbation. Perturbations can be used by a computer for calculations and expression
comparisons.

[0237] Preferably, in addition to identifying a perturbation as positive or negative, it is advantageous to determine
the magnitude of the perturbation. This can be carried out, as noted above, by calculating the ratio of the emission of
the two fluorophores used for differential labeling, or by analogous methods that will be readily apparent to those of
skill in the art.
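
As a minimal sketch of the scoring described in the two preceding paragraphs, the ratio of two abundance (or fluorophore emission) values can be reduced to a sign and a magnitude. The function name, the default threshold and the input values are assumptions made for illustration.

```python
def score_perturbation(abundance_a, abundance_b, min_fold=2.0):
    """Score the relative abundance of one mRNA in two samples.

    Returns (label, fold), where fold >= 1 is the magnitude of the difference and
    label is "positive" (more abundant in sample A), "negative" (less abundant in
    sample A) or "not perturbed" (fold below min_fold).
    min_fold=1.25 corresponds to "about 25%", 1.5 to "about 50%", 2.0 to "twice".
    """
    if abundance_a <= 0 or abundance_b <= 0:
        raise ValueError("abundances must be positive")
    ratio = abundance_a / abundance_b
    fold = max(ratio, 1.0 / ratio)
    if fold < min_fold:
        return ("not perturbed", fold)
    return ("positive" if ratio > 1.0 else "negative", fold)

# Hypothetical emission intensities of the two fluorophores.
print(score_perturbation(200.0, 100.0))
print(score_perturbation(100.0, 400.0))
print(score_perturbation(110.0, 100.0, min_fold=1.25))
```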

[0238] The computer readable medium may further comprise a pointer to a descriptor of a stage of breast cancer or 
to a treatment for breast cancer. 

[0239] In operation, the means for receiving gene expression data, the means for comparing the gene expression
data, the means for presenting, the means for normalizing, and the means for clustering within the context of the
systems of the present invention can involve a programmed computer with the respective functionalities described
herein, implemented in hardware or hardware and software; a logic circuit or other component of a programmed computer
that performs the operations specifically identified herein, dictated by a computer program; or a computer memory
encoded with executable instructions representing a computer program that can cause a computer to function in the
particular fashion described herein.

[0240] Those skilled in the art will understand that the systems and methods of the present invention may be applied 
to a variety of systems, including IBM-compatible personal computers running MS-DOS or Microsoft Windows. 






[0241] The computer may have internal components linked to external components. The internal components may
include a processor element interconnected with a main memory. The computer system can be an Intel Pentium®-based
processor of 200 MHz or greater clock rate and with 32 MB or more of main memory. The external component
may comprise a mass storage, which can be one or more hard disks (which are typically packaged together with the
processor and memory). Such hard disks are typically of 1 GB or greater storage capacity. Other external components
include a user interface device, which can be a monitor, together with an input device, which can be a "mouse", or
other graphic input devices, and/or a keyboard. A printing device can also be attached to the computer.
[0242] Typically, the computer system is also connected to a network link, which can be part of an Ethernet link to other
local computer systems, remote computer systems, or wide area communication networks, such as the Internet. This
network link allows the computer system to share data and processing tasks with other computer systems.

[0243] Loaded into memory during operation of this system are several software components, which are both standard
in the art and special to the instant invention. These software components collectively cause the computer system
to function according to the methods of this invention. These software components are typically stored on a mass
storage. A software component represents the operating system, which is responsible for managing the computer
system and its network interconnections. This operating system can be, for example, of the Microsoft Windows family,
such as Windows 95, Windows 98, or Windows NT. A software component represents common languages and functions
conveniently present on this system to assist programs implementing the methods specific to this invention. Many high
or low level computer languages can be used to program the analytic methods of this invention. Instructions can be
interpreted during run-time or compiled. Preferred languages include C/C++ and JAVA®. Most preferably, the methods
of this invention are programmed in mathematical software packages which allow symbolic entry of equations and
high-level specification of processing, including algorithms to be used, thereby freeing a user of the need to procedurally
program individual equations or algorithms. Such packages include Matlab from Mathworks (Natick, Mass.), Mathematica
from Wolfram Research (Champaign, Ill.), or S-Plus from MathSoft (Cambridge, Mass.). Accordingly, a software
component represents the analytic methods of this invention as programmed in a procedural language or symbolic
package. In a preferred embodiment, the computer system also contains a database comprising values representing
levels of expression of one or more genes characteristic of breast cancer. The database may contain one or more
expression profiles of genes characteristic of breast cancer in different cells.

[0244] In an exemplary implementation, to practice the methods of the present invention, a user first loads expression
profile data into the computer system. These data can be directly entered by the user from a monitor and keyboard,
or from other computer systems linked by a network connection, or on removable storage media such as a CD-ROM
or floppy disk or through the network. Next the user causes execution of expression profile analysis software which
performs the steps of comparing and, e.g., clustering co-varying genes into groups of genes.
[0245] In another exemplary implementation, expression profiles are compared using a method described in U.S.
Patent No. 6,203,987. A user first loads expression profile data into the computer system. Geneset profile definitions
are loaded into the memory from the storage media or from a remote computer, preferably from a dynamic geneset
database system, through the network. Next the user causes execution of projection software which performs the steps
of converting expression profiles to projected expression profiles. The projected expression profiles are then displayed.
[0246] In yet another exemplary implementation, a user first loads a projected profile into the memory. The user then
causes the loading of a reference profile into the memory. Next, the user causes the execution of comparison software
which performs the steps of objectively comparing the profiles.

Detection of variant polynucleotide sequence 

[0247] In yet another embodiment, the invention provides methods for determining whether a subject is at risk for
developing a disease, such as a predisposition to develop malignant neoplasia, for example breast cancer, associated
with an aberrant activity of any one of the polypeptides encoded by any of the polynucleotides of the SEQ ID NO: 1 to
26 or 53 to 75, wherein the aberrant activity of the polypeptide is characterized by detecting the presence or absence
of a genetic lesion characterized by at least one of these:

(i) an alteration affecting the integrity of a gene encoding a marker polypeptide, or

(ii) the misexpression of the encoding polynucleotide.

[0248] To illustrate, such genetic lesions can be detected by ascertaining the existence of at least one of these: 

I. a deletion of one or more nucleotides from the polynucleotide sequence

II. an addition of one or more nucleotides to the polynucleotide sequence

III. a substitution of one or more nucleotides of the polynucleotide sequence

IV. a gross chromosomal rearrangement of the polynucleotide sequence

V. a gross alteration in the level of a messenger RNA transcript of the polynucleotide sequence

VI. aberrant modification of the polynucleotide sequence, such as of the methylation pattern of the genomic DNA

VII. the presence of a non-wild type splicing pattern of a messenger RNA transcript of the gene

VIII. a non-wild type level of the marker polypeptide

IX. allelic loss of the gene

X. allelic gain of the gene

XI. inappropriate post-translational modification of the marker polypeptide

[0249] The present invention provides assay techniques for detecting mutations in the encoding polynucleotide
sequence. These methods include, but are not limited to, methods involving sequence analysis, Southern blot
hybridization, restriction enzyme site mapping, and methods involving detection of absence of nucleotide pairing between
the polynucleotide to be analyzed and a probe.

[0250] Specific diseases or disorders, e.g., genetic diseases or disorders, are associated with specific allelic variants
of polymorphic regions of certain genes, which do not necessarily encode a mutated protein. Thus, the presence of a
specific allelic variant of a polymorphic region of a gene in a subject can render the subject susceptible to developing
a specific disease or disorder. Polymorphic regions in genes can be identified by determining the nucleotide sequence
of genes in populations of individuals. If a polymorphic region is identified, then the link with a specific disease can be
determined by studying specific populations of individuals, e.g., individuals who developed a specific disease, such
as breast cancer. A polymorphic region can be located in any region of a gene, e.g., in coding or non-coding
regions of exons, in introns, or in the promoter region.

[0251] In an exemplary embodiment, there is provided a polynucleotide composition comprising a polynucleotide
probe including a region of nucleotide sequence which is capable of hybridising to a sense or antisense sequence of
a gene or naturally occurring mutants thereof, or 5' or 3' flanking sequences or intronic sequences naturally associated
with the subject genes or naturally occurring mutants thereof. The polynucleotide of a cell is rendered accessible for
hybridization, the probe is contacted with the polynucleotide of the sample, and the hybridization of the probe to the
sample polynucleotide is detected. Such techniques can be used to detect lesions or allelic variants at either the genomic
or mRNA level, including deletions, substitutions, etc., as well as to determine mRNA transcript levels.
[0252] A preferred detection method is allele specific hybridization using probes overlapping the mutation or polymorphic
site and having about 5, 10, 20, 25, or 30 nucleotides around the mutation or polymorphic region. In a preferred
embodiment of the invention, several probes capable of hybridising specifically to allelic variants are attached to a solid
phase support, e.g., a "chip". Mutation detection analysis using these chips comprising oligonucleotides, also termed
"DNA probe arrays", is described, e.g., in Cronin et al. (119). In one embodiment, a chip comprises all the allelic variants
of at least one polymorphic region of a gene. The solid phase support is then contacted with a test polynucleotide and
hybridization to the specific probes is detected. Accordingly, the identity of numerous allelic variants of one or more
genes can be identified in a simple hybridization experiment.
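
The probe design described above can be illustrated with a short sketch that builds one probe per allelic variant around a polymorphic site and then asks which probes match a test sequence. The sequence, the site position and the flank length are invented for the example, and exact string matching is only a crude stand-in for stringent hybridization (strand complementarity and mismatch tolerance are ignored).

```python
def allele_specific_probes(reference, site, alleles, flank=10):
    """Build one probe per allele, covering up to `flank` bases on each side of `site`.

    `site` is a 0-based index into `reference`; each probe carries one allele at that position.
    """
    if not 0 <= site < len(reference):
        raise ValueError("site lies outside the reference sequence")
    start = max(0, site - flank)
    end = min(len(reference), site + flank + 1)
    return {
        allele: reference[start:site] + allele + reference[site + 1:end]
        for allele in alleles
    }

def call_alleles(test_sequence, probes):
    """Return the alleles whose probe occurs exactly in the test sequence."""
    return sorted(allele for allele, probe in probes.items() if probe in test_sequence)

# Hypothetical reference sequence with a G/A polymorphism at index 15.
reference = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"
site = 15
probes = allele_specific_probes(reference, site, alleles=("G", "A"), flank=5)

variant = reference[:site] + "A" + reference[site + 1:]
print(probes)
print(call_alleles(reference, probes), call_alleles(variant, probes))
```

A chip carrying every probe returned by `allele_specific_probes` corresponds to the embodiment in which all allelic variants of a polymorphic region are represented on one support.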

[0253] In certain embodiments, detection of the lesion comprises utilizing the probe/primer in a polymerase chain
reaction (PCR) (see, e.g., U.S. Patent Nos. 4,683,195 and 4,683,202), such as anchor PCR or RACE PCR, or, alternatively,
in a ligase chain reaction (LCR) [Landegran et al., 1988, (120) and Nakazawa et al., 1994, (121)], the latter of
which can be particularly useful for detecting point mutations in the gene [Abravaya et al., 1995, (122)]. In a merely
illustrative embodiment, the method includes the steps of (i) collecting a sample of cells from a patient, (ii) isolating
polynucleotide (e.g., genomic, mRNA or both) from the cells of the sample, (iii) contacting the polynucleotide sample
with one or more primers which specifically hybridize to a polynucleotide sequence under conditions such that hybridization
and amplification of the polynucleotide (if present) occurs, and (iv) detecting the presence or absence of an
amplification product, or detecting the size of the amplification product and comparing the length to a control sample.
It is anticipated that PCR and/or LCR may be desirable to use as a preliminary amplification step in conjunction with
any of the techniques used for detecting mutations described herein.

[0254] Alternative amplification methods include: self sustained sequence replication [Guatelli, J.C. et al., 1990,
(123)], transcriptional amplification system [Kwoh, D.Y. et al., 1989, (124)], Q-Beta replicase [Lizardi, P.M. et al., 1988,
(125)], or any other polynucleotide amplification method, followed by the detection of the amplified molecules using
techniques well known to those of skill in the art. These detection schemes are especially useful for the detection of
polynucleotide molecules if such molecules are present in very low numbers.

[0255] In a preferred embodiment of the subject assay, mutations in, or allelic variants of, a gene from a sample cell
are identified by alterations in restriction enzyme cleavage patterns. For example, sample and control DNA is isolated,
amplified (optionally), digested with one or more restriction endonucleases, and fragment length sizes are determined
by gel electrophoresis. Moreover, sequence specific ribozymes (see, for example, U.S. Patent No. 5,498,531)
can be used to score for the presence of specific mutations by development or loss of a ribozyme cleavage site.
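
The comparison of cleavage patterns can be sketched in silico as follows. The example assumes a linear DNA fragment, a single enzyme (EcoRI, which recognizes GAATTC and cuts after the first G) and invented sample and control sequences; on a gel only the resulting fragment lengths would be observed.

```python
def digest(sequence, site, cut_offset):
    """Return the sorted fragment lengths obtained by cutting a linear sequence.

    The sequence is cut `cut_offset` bases into every occurrence of the recognition `site`.
    """
    cuts = []
    pos = sequence.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = sequence.find(site, pos + 1)
    bounds = [0] + cuts + [len(sequence)]
    return sorted(b - a for a, b in zip(bounds, bounds[1:]) if b > a)

ECORI_SITE, ECORI_OFFSET = "GAATTC", 1  # G^AATTC

# Hypothetical control DNA with two EcoRI sites; the sample carries an A->G
# substitution that destroys the second site.
control = "AAAA" + "GAATTC" + "CCCCCCCC" + "GAATTC" + "TT"
sample = "AAAA" + "GAATTC" + "CCCCCCCC" + "GAGTTC" + "TT"

control_pattern = digest(control, ECORI_SITE, ECORI_OFFSET)
sample_pattern = digest(sample, ECORI_SITE, ECORI_OFFSET)
print(control_pattern, sample_pattern, control_pattern != sample_pattern)
```

A difference between the two fragment-length lists corresponds to an altered cleavage pattern of the kind scored in the assay.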

In situ hybridization

[0256] In one aspect, the method comprises in situ hybridization with a probe derived from a given marker polynucleotide,
which sequence is selected from any of the polynucleotide sequences of the SEQ ID NO: 1 to 9, or 11 to 19
or 21 to 26 and 53 to 75 or a sequence complementary thereto. The method comprises contacting the labeled hybridization
probe with a sample of a given type of tissue from a patient potentially having malignant neoplasia and breast
cancer in particular as well as normal tissue from a person with no malignant neoplasia, and determining whether the
probe labels tissue of the patient to a degree significantly different (e.g., by at least a factor of two, or at least a factor
of five, or at least a factor of twenty, or at least a factor of fifty) than the degree to which normal tissue is labelled.

Polypeptide detection

[0257] The subject invention further provides a method of determining whether a cell sample obtained from a subject
possesses an abnormal amount of marker polypeptide which comprises (a) obtaining a cell sample from the subject,
(b) quantitatively determining the amount of the marker polypeptide in the sample so obtained, and (c) comparing the
amount of the marker polypeptide so determined with a known standard, so as to thereby determine whether the cell
sample obtained from the subject possesses an abnormal amount of the marker polypeptide. Such marker polypeptides
may be detected by immunohistochemical assays, dot-blot assays, ELISA and the like.

Antibodies 

[0258] Any type of antibody known in the art can be generated to bind specifically to an epitope of a "BREAST
CANCER GENE" polypeptide. An antibody as used herein includes intact immunoglobulin molecules, as well as fragments
thereof, such as Fab, F(ab')2, and Fv, which are capable of binding an epitope of a "BREAST CANCER GENE"
polypeptide. Typically, at least 6, 8, 10, or 12 contiguous amino acids are required to form an epitope. However, epitopes
which involve non-contiguous amino acids may require more, e.g., at least 15, 25, or 50 amino acids.

[0259] An antibody which specifically binds to an epitope of a "BREAST CANCER GENE" polypeptide can be used
therapeutically, as well as in immunochemical assays, such as Western blots, ELISAs, radioimmunoassays,
immunohistochemical assays, immunoprecipitations, or other immunochemical assays known in the art. Various
immunoassays can be used to identify antibodies having the desired specificity. Numerous protocols for competitive binding
or immunoradiometric assays are well known in the art. Such immunoassays typically involve the measurement of
complex formation between an immunogen and an antibody which specifically binds to the immunogen.
[0260] Typically, an antibody which specifically binds to a "BREAST CANCER GENE" polypeptide provides a detection
signal at least 5-, 10-, or 20-fold higher than a detection signal provided with other proteins when used in an
immunochemical assay. Preferably, antibodies which specifically bind to "BREAST CANCER GENE" polypeptides do
not detect other proteins in immunochemical assays and can immunoprecipitate a "BREAST CANCER GENE" polypeptide
from solution.

[0261] "BREAST CANCER GENE" polypeptides can be used to immunize a mammal, such as a mouse, rat, rabbit,
guinea pig, monkey, or human, to produce polyclonal antibodies. If desired, a "BREAST CANCER GENE" polypeptide
can be conjugated to a carrier protein, such as bovine serum albumin, thyroglobulin, or keyhole limpet hemocyanin.
Depending on the host species, various adjuvants can be used to increase the immunological response. Such adjuvants
include, but are not limited to, Freund's adjuvant, mineral gels (e.g., aluminum hydroxide), and surface active
substances (e.g., lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions, keyhole limpet hemocyanin, and
dinitrophenol). Among adjuvants used in humans, BCG (bacilli Calmette-Guerin) and Corynebacterium parvum are
especially useful.

[0262] Monoclonal antibodies which specifically bind to a "BREAST CANCER GENE" polypeptide can be prepared
using any technique which provides for the production of antibody molecules by continuous cell lines in culture. These
techniques include, but are not limited to, the hybridoma technique, the human B cell hybridoma technique, and the
EBV hybridoma technique [Kohler et al., 1985, (136); Kozbor et al., 1985, (137); Cote et al., 1983, (138) and Cole et
al., 1984, (139)].

[0263] In addition, techniques developed for the production of chimeric antibodies, the splicing of mouse antibody
genes to human antibody genes to obtain a molecule with appropriate antigen specificity and biological activity, can
be used [Morrison et al., 1984, (140); Neuberger et al., 1984, (141); Takeda et al., 1985, (142)]. Monoclonal and other
antibodies also can be humanized to prevent a patient from mounting an immune response against the antibody when
it is used therapeutically. Such antibodies may be sufficiently similar in sequence to human antibodies to be used
directly in therapy or may require alteration of a few key residues. Sequence differences between rodent antibodies
and human sequences can be minimized by replacing residues which differ from those in the human sequences by
site directed mutagenesis of individual residues or by grafting of entire complementarity determining regions.
Alternatively, humanized antibodies can be produced using recombinant methods, as described in GB2188638B. Antibodies
which specifically bind to a "BREAST CANCER GENE" polypeptide can contain antigen binding sites which are either
partially or fully humanized, as disclosed in U.S. Patent 5,565,332.

[0264] Alternatively, techniques described for the production of single chain antibodies can be adapted using methods
known in the art to produce single chain antibodies which specifically bind to "BREAST CANCER GENE" polypeptides.
Antibodies with related specificity, but of distinct idiotypic composition, can be generated by chain shuffling from random
combinatorial immunoglobulin libraries [Burton, 1991, (143)].

[0265] Single-chain antibodies also can be constructed using a DNA amplification method, such as PCR, using
hybridoma cDNA as a template [Thirion et al., 1996, (144)]. Single-chain antibodies can be mono- or bispecific, and
can be bivalent or tetravalent. Construction of tetravalent, bispecific single-chain antibodies is taught, for example, in
Coloma & Morrison, (145). Construction of bivalent, bispecific single-chain antibodies is taught in Mallender & Voss,
(146).

[0266] A nucleotide sequence encoding a single-chain antibody can be constructed using manual or automated
nucleotide synthesis, cloned into an expression construct using standard recombinant DNA methods, and introduced
into a cell to express the coding sequence, as described below. Alternatively, single-chain antibodies can be produced
directly using, for example, filamentous phage technology [Verhaar et al., 1995, (147); Nicholls et al., 1993, (148)].
[0267] Antibodies which specifically bind to "BREAST CANCER GENE" polypeptides also can be produced by in- 
ducing in vivo production in the lymphocyte population or by screening immunoglobulin libraries or panels of highly 
specific binding reagents as disclosed in the literature [Orlandi et al., 1989, (149) and Winter et al., 1991, (150)]. 
[0268] Other types of antibodies can be constructed and used therapeutically in methods of the invention. For example,
chimeric antibodies can be constructed as disclosed in WO 93/03151. Binding proteins which are derived from
immunoglobulins and which are multivalent and multispecific, such as the antibodies described in WO 94/13804, also
can be prepared.

[0269] Antibodies according to the invention can be purified by methods well known in the art. For example, antibodies
can be affinity purified by passage over a column to which a "BREAST CANCER GENE" polypeptide is bound. The
bound antibodies can then be eluted from the column using a buffer with a high salt concentration.

[0270] Immunoassays are commonly used to quantify the levels of proteins in cell samples, and many other
immunoassay techniques are known in the art. The invention is not limited to a particular assay procedure, and therefore is
intended to include both homogeneous and heterogeneous procedures. Exemplary immunoassays which can be conducted
according to the invention include fluorescence polarisation immunoassay (FPIA), fluorescence immunoassay
(FIA), enzyme immunoassay (EIA), nephelometric inhibition immunoassay (NIA), enzyme linked immunosorbent assay
(ELISA), and radioimmunoassay (RIA). An indicator moiety, or label group, can be attached to the subject antibodies
and is selected so as to meet the needs of various uses of the method which are often dictated by the availability of
assay equipment and compatible immunoassay procedures. General techniques to be used in performing the various
immunoassays noted above are known to those of ordinary skill in the art.

[0271] In another embodiment, the level of at least one product encoded by any of the polynucleotide sequences of
the SEQ ID NO: 2 to 6, 8, 9, 11 to 16, 18, 19 or 21 to 26 or 53 to 75 or of at least 2 products encoded by a polynucleotide
selected from SEQ ID NO: 1 to 26 and 53 to 75 or a sequence complementary thereto, in a biological fluid (e.g., blood
or urine) of a patient may be determined as a way of monitoring the level of expression of the marker polynucleotide
sequence in cells of that patient. Such a method would include the steps of obtaining a sample of a biological fluid
from the patient, contacting the sample (or proteins from the sample) with an antibody specific for an encoded marker
polypeptide, and determining the amount of immune complex formation by the antibody, with the amount of immune
complex formation being indicative of the level of the marker encoded product in the sample. This determination is
particularly instructive when compared to the amount of immune complex formation by the same antibody in a control
sample taken from a normal individual or in one or more samples previously or subsequently obtained from the same
person.

[0272] In another embodiment, the method can be used to determine the amount of marker polypeptide present in 
a cell, which in turn can be correlated with progression of the disorder, e.g., plaque formation. The level of the marker 
polypeptide can be used predictively to evaluate whether a sample of cells contains cells which are, or are predisposed 
towards becoming, plaque associated cells. The observation of marker polypeptide level can be utilized in decisions
regarding, e.g., the use of more stringent therapies. 

[0273] As set out above, one aspect of the present invention relates to diagnostic assays for determining, in the
context of cells isolated from a patient, if the level of a marker polypeptide is significantly reduced in the sample cells.
The term "significantly reduced" refers to a cell phenotype wherein the cell possesses a reduced cellular amount of
the marker polypeptide relative to a normal cell of similar tissue origin. For example, a cell may have less than about
50%, 25%, 10%, or 5% of the marker polypeptide that a normal control cell has. In particular, the assay evaluates the level
of marker polypeptide in the test cells, and, preferably, compares the measured level with marker polypeptide detected
in at least one control cell, e.g., a normal cell and/or a transformed cell of known phenotype.

[0274] Of particular importance to the subject invention is the ability to quantify the level of marker polypeptide as
determined by the number of cells associated with a normal or abnormal marker polypeptide level. The number of cells
with a particular marker polypeptide phenotype may then be correlated with patient prognosis. In one embodiment of
the invention, the marker polypeptide phenotype of the lesion is determined as a percentage of cells in a biopsy which
are found to have abnormally high/low levels of the marker polypeptide. Such expression may be detected by
immunohistochemical assays, dot-blot assays, ELISA and the like.
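
The percentage-based phenotype described above might be computed along the following lines. The "low" threshold of 50% of the control level is one of the example values given for "significantly reduced"; the two-fold "high" threshold, the function names and the per-cell measurements are assumptions made for this sketch.

```python
from collections import Counter

def classify_cell(level, control_level, low_fraction=0.5, high_fold=2.0):
    """Classify one cell's marker level relative to a normal control cell."""
    if control_level <= 0:
        raise ValueError("control level must be positive")
    ratio = level / control_level
    if ratio < low_fraction:
        return "low"
    if ratio > high_fold:
        return "high"
    return "normal"

def phenotype_percentages(cell_levels, control_level, **thresholds):
    """Return the percentage of cells in a biopsy with low, normal and high marker levels."""
    if not cell_levels:
        raise ValueError("at least one cell measurement is required")
    counts = Counter(classify_cell(level, control_level, **thresholds) for level in cell_levels)
    total = len(cell_levels)
    return {label: 100.0 * counts[label] / total for label in ("low", "normal", "high")}

# Hypothetical per-cell staining intensities from one biopsy; control level = 100.
biopsy = [10, 40, 100, 120, 250, 300, 90, 20]
percentages = phenotype_percentages(biopsy, control_level=100)
print(percentages)
```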

Immunohistochemistry 

[0275] Where tissue samples are employed, immunohistochemical staining may be used to determine the number 
of cells having the marker polypeptide phenotype. For such staining, a multiblock of tissue is taken from the biopsy or
other tissue sample and subjected to proteolytic hydrolysis, employing such agents as protease K or pepsin. In certain 
embodiments, it may be desirable to isolate a nuclear fraction from the sample cells and detect the level of the marker 
polypeptide in the nuclear fraction. 

[0276] The tissue samples are fixed by treatment with a reagent such as formalin, glutaraldehyde, methanol, or the
like. The samples are then incubated with an antibody, preferably a monoclonal antibody, with binding specificity for
the marker polypeptides. This antibody may be conjugated to a label for subsequent detection of binding. Samples
are incubated for a time sufficient for formation of the immuno-complexes. Binding of the antibody is then detected by
virtue of a label conjugated to this antibody. Where the antibody is unlabelled, a second labeled antibody may be
employed, e.g., which is specific for the isotype of the anti-marker polypeptide antibody. Examples of labels which may
be employed include radionuclides, fluorescence, chemiluminescence, and enzymes.

[0277] Where enzymes are employed, the substrate for the enzyme may be added to the samples to provide a
colored or fluorescent product. Examples of suitable enzymes for use in conjugates include horseradish peroxidase, 
alkaline phosphatase, malate dehydrogenase and the like. Where not commercially available, such antibody-enzyme 
conjugates are readily produced by techniques known to those skilled in the art. 
[0278] In one embodiment, the assay is performed as a dot blot assay. The dot blot assay finds particular application
where tissue samples are employed as it allows determination of the average amount of the marker polypeptide associated
with a single cell by correlating the amount of marker polypeptide in a cell-free extract produced from a
predetermined number of cells.

[0279] In yet another embodiment, the invention contemplates using one or more antibodies which are generated
against one or more of the marker polypeptides of this invention, which polypeptides are encoded by any of the
polynucleotide sequences of the SEQ ID NO: 1 to 26 or 53 to 75. Such a panel of antibodies may be used as a reliable
diagnostic probe for breast cancer. The assay of the present invention comprises contacting a biopsy sample containing
cells, e.g., macrophages, with a panel of antibodies to one or more of the encoded products to determine the presence
or absence of the marker polypeptides.
[0280] The diagnostic methods of the subject invention may also be employed as follow-up to treatment, e.g.,
quantification of the level of marker polypeptides may be indicative of the effectiveness of current or previously employed
therapies for malignant neoplasia and breast cancer in particular as well as the effect of these therapies upon patient
prognosis.

[0281] The diagnostic assays described above can be adapted to be used as prognostic assays, as well. Such an
application takes advantage of the sensitivity of the assays of the invention to events which take place at characteristic
stages in the progression of plaque generation in case of malignant neoplasia. For example, a given marker gene may
be up- or down-regulated at a very early stage, perhaps before the cell is developing into a foam cell, while another
marker gene may be characteristically up- or down-regulated only at a much later stage. Such a method could involve
the steps of contacting the mRNA of a test cell with a polynucleotide probe derived from a given marker polynucleotide
which is expressed at different characteristic levels in breast cancer tissue cells at different stages of malignant
neoplasia progression, and determining the approximate amount of hybridization of the probe to the mRNA of the cell,
such amount being an indication of the level of expression of the gene in the cell, and thus an indication of the stage
of disease progression of the cell; alternatively, the assay can be carried out with an antibody specific for the gene
product of the given marker polynucleotide, contacted with the proteins of the test cell. A battery of such tests will
disclose not only the existence of a certain arteriosclerotic plaque, but also will allow the clinician to select the mode
of treatment most appropriate for the disease, and to predict the likelihood of success of that treatment.
[0282] The methods of the invention can also be used to follow the clinical course of a given breast cancer
predisposition. For example, the assay of the invention can be applied to a blood sample from a patient; following treatment
of the patient for breast cancer, another blood sample is taken and the test repeated. Successful treatment will
result in removal of cells which demonstrate differential expression characteristic of the breast cancer tissue cells,
or in expression levels approaching or even surpassing normal levels.

Polypeptide activity

[0283] In one embodiment the present invention provides a method for screening potentially therapeutic agents which
modulate the activity of one or more "BREAST CANCER GENE" polypeptides, such that if the activity of the polypeptide
is increased as a result of the upregulation of the "BREAST CANCER GENE" in a subject having or at risk for malignant
neoplasia and breast cancer in particular, the therapeutic substance will decrease the activity of the polypeptide relative
to the activity of the same polypeptide in a subject not having or not at risk for malignant neoplasia or breast cancer
in particular but not treated with the therapeutic agent. Likewise, if the activity of the polypeptide as a result of the
downregulation of the "BREAST CANCER GENE" is decreased in a subject having or at risk for malignant neoplasia
or breast cancer in particular, the therapeutic agent will increase the activity of the polypeptide relative to the activity
of the same polypeptide in a subject not having or not at risk for malignant neoplasia or breast cancer in particular, but
not treated with the therapeutic agent.

[0284] The activity of the "BREAST CANCER GENE" polypeptides indicated in Table 2 or 3 may be measured by
any means known to those of skill in the art, and which are particular for the type of activity performed by the particular
polypeptide. Examples of specific assays which may be used to measure the activity of particular polynucleotides are
shown below.

a) G protein coupled receptors 

[0285] In one embodiment, the "BREAST CANCER GENE" polynucleotide may encode a G protein coupled receptor.
In one embodiment, the present invention provides a method of screening potential modulators (inhibitors or activators)
of the G protein coupled receptor by measuring changes in the activity of the receptor in the presence of a candidate
modulator.

1. Gi-coupled receptors

[0286] Cells (such as CHO cells or primary cells) are stably transfected with the relevant receptor and with an
inducible CRE-luciferase construct. Cells are grown in 50% Dulbecco's modified Eagle medium / 50% F12 (DMEM/F12)
supplemented with 10% FBS, at 37°C in a humidified atmosphere with 10% CO2 and are routinely split at a ratio of
1:10 every 2 or 3 days. Test cultures are seeded into 384-well plates at an appropriate density (e.g. 2000 cells/well in
35 µl cell culture medium) in DMEM/F12 with FBS, and are grown for 48 hours (range: 24-60 hours, depending on
cell line). Growth medium is then exchanged against serum-free medium (SFM; e.g. Ultra-CHO) containing 0.1% BSA.
Test compounds dissolved in DMSO are diluted in SFM and transferred to the test cultures (maximal final concentration
10 µmolar), followed by addition of forskolin (~1 µmolar, final conc.) in SFM + 0.1% BSA 10 minutes later. In case of
antagonist screening, both an appropriate concentration of agonist and forskolin are added. The plates are incubated
at 37°C in 10% CO2 for 3 hours. Then the supernatant is removed and the cells are lysed with lysis reagent (25 mmolar
phosphate buffer, pH 7.8, containing 2 mmolar DTT, 10% glycerol and 3% Triton X100). The luciferase reaction is
started by addition of substrate buffer (e.g. luciferase assay reagent, Promega) and luminescence is immediately
determined (e.g. Berthold luminometer or Hamamatsu camera system).
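
As a minimal sketch of how the luminescence read-out of this forskolin-based assay might be evaluated, the following Python fragment expresses a well's signal as percent inhibition of the forskolin-stimulated response. The function name, the two control wells (forskolin only and unstimulated) and the example readings are illustrative assumptions; the text does not prescribe a particular normalization.

```python
def percent_inhibition(sample_rlu, forskolin_rlu, basal_rlu):
    """Percent inhibition of the forskolin-stimulated luciferase signal.

    sample_rlu    : luminescence of the well with test compound + forskolin
    forskolin_rlu : luminescence of the forskolin-only control (assumed control)
    basal_rlu     : luminescence of the unstimulated control (assumed control)
    """
    window = forskolin_rlu - basal_rlu
    if window <= 0:
        raise ValueError("forskolin control must exceed the basal control")
    return 100.0 * (forskolin_rlu - sample_rlu) / window


# Hypothetical relative light units, for illustration only.
inhibition = percent_inhibition(sample_rlu=6000, forskolin_rlu=10000, basal_rlu=2000)
print(f"{inhibition:.1f} % inhibition of the forskolin response")
```

A compound that lowers the forskolin-induced signal gives a positive value; a value near zero means the compound left the forskolin response unchanged.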

2. Gs-coupled receptors

[0287] Cells (such as CHO cells or primary cells) are stably transfected with the relevant receptor and with an
inducible CRE-luciferase construct. Cells are grown in 50% Dulbecco's modified Eagle medium / 50% F12 (DMEM/F12)
supplemented with 10% FBS, at 37°C in a humidified atmosphere with 10% CO2 and are routinely split at a ratio of
1:10 every 2 or 3 days. Test cultures are seeded into 384-well plates at an appropriate density (e.g. 1000 or 2000
cells/well in 35 µl cell culture medium) in DMEM/F12 with FBS, and are grown for 48 hours (range: 24-60 hours,
depending on cell line). The assay is started by addition of test compounds in serum-free medium (SFM; e.g. Ultra-CHO)
containing 0.1% BSA: test compounds are dissolved in DMSO, diluted in SFM and transferred to the test cultures
(maximal final concentration 10 µmolar, DMSO conc. < 0.6%). In case of antagonist screening an appropriate
concentration of agonist is added 5-10 minutes later. The plates are incubated at 37°C in 10% CO2 for 3 hours. Then
the cells are lysed with 10 µl lysis reagent per well (25 mmolar phosphate buffer, pH 7.8, containing 2 mmolar DTT,
10% glycerol and 3% Triton X100) and the luciferase reaction is started by addition of 20 µl substrate buffer per well
(e.g. luciferase assay reagent, Promega). Measurement of luminescence is started immediately (e.g. Berthold
luminometer or Hamamatsu camera system).
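
The DMSO limit stated above (< 0.6%) constrains how concentrated the compound stock must be. The following sketch illustrates the arithmetic under two assumptions that are not stated in the text: the compound is dissolved in neat DMSO, and all DMSO in the well originates from that stock. The function names and the stock concentrations are hypothetical.

```python
def final_dmso_percent(stock_molar, final_molar):
    """Final DMSO content (% v/v) when a neat-DMSO stock is diluted to final_molar."""
    if stock_molar <= 0 or final_molar < 0:
        raise ValueError("concentrations must be positive")
    return 100.0 * final_molar / stock_molar


def min_stock_molar(final_molar, max_dmso_percent=0.6):
    """Lowest stock concentration that keeps DMSO at or below max_dmso_percent."""
    if max_dmso_percent <= 0:
        raise ValueError("DMSO limit must be positive")
    return 100.0 * final_molar / max_dmso_percent


# Hypothetical 10 mmolar stock diluted to the maximal final concentration of 10 umolar.
print(final_dmso_percent(stock_molar=10e-3, final_molar=10e-6))
print(min_stock_molar(final_molar=10e-6))
```

Under these assumptions a 10 mmolar stock stays well below the limit, whereas a 1 mmolar stock would exceed it at a final concentration of 10 µmolar.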

3. Gq-coupled receptors

[0288] Cells (such as CHO cells or primary cells) are stably transfected with the relevant receptor. Cells expressing
functional receptor protein are grown in 50% Dulbecco's modified Eagle medium / 50% F12 (DMEM/F12) supplemented
with 10% FBS, at 37°C in a humidified atmosphere with 5% CO2 and are routinely split at a cell line dependent ratio
every 3 or 4 days. Test cultures are seeded into 384-well plates at an appropriate density (e.g. 2000 cells/well in 35
µl cell culture medium) in DMEM/F12 with FBS, and are grown for 48 hours (range: 24-60 hours, depending on cell
line). Growth medium is then exchanged against physiological salt solution (e.g. Tyrode solution). Test compounds
dissolved in DMSO are diluted in Tyrode solution containing 0.1% BSA and transferred to the test cultures (maximal
final concentration 10 µmolar). After addition of the receptor-specific agonist the resulting Gq-mediated intracellular
calcium increase is measured using appropriate read-out systems (e.g. calcium-sensitive dyes).

b) Ion channels

[0289] Ion channels are integral membrane proteins involved in electrical signaling, transmembrane signal
transduction, and electrolyte and solute transport. By forming macromolecular pores through the membrane lipid bilayer,
ion channels account for the flow of specific ion species driven by the electrochemical potential gradient for the
permeating ion. At the single molecule level, individual channels undergo conformational transitions ("gating") between
the 'open' (ion conducting) and 'closed' (non conducting) state. Typical single channel openings last for a few
milliseconds and result in elementary transmembrane currents in the range of 10^-9 to 10^-12 ampere. Channel gating is
controlled by various chemical and/or biophysical parameters, such as neurotransmitters and intracellular second
messengers ('ligand-gated' channels) or membrane potential ('voltage-gated' channels). Ion channels are functionally
characterized by their ion selectivity, gating properties, and regulation by hormones and pharmacological agents.
Because of their central role in signaling and transport processes, ion channels present ideal targets for
pharmacological therapeutics in various pathophysiological settings.
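
To give a feeling for the magnitude of these elementary currents, the short calculation below converts a single-channel current into the number of ions crossing the membrane. It is an illustrative sketch only: the function names are hypothetical, and the example values (1 pA, 1 ms, monovalent ion) are assumptions chosen from within the ranges quoted above.

```python
ELEMENTARY_CHARGE = 1.602176634e-19  # coulombs per elementary charge


def ions_per_second(current_ampere, valence=1):
    """Number of ions per second carried by a given single-channel current."""
    if valence == 0:
        raise ValueError("valence must be non-zero")
    return abs(current_ampere) / (abs(valence) * ELEMENTARY_CHARGE)


def ions_per_opening(current_ampere, open_time_s, valence=1):
    """Number of ions passing during one channel opening of the given duration."""
    return ions_per_second(current_ampere, valence) * open_time_s


# Assumed example: 1 pA flowing for 1 ms through a channel conducting a monovalent ion.
print(f"{ions_per_second(1e-12):.3e} ions per second")
print(f"{ions_per_opening(1e-12, 1e-3):.0f} ions per opening")
```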

[0290] In one embodiment, the "BREAST CANCER GENE" may encode an ion channel. In one embodiment, the
present invention provides a method of screening potential activators or inhibitors of the channel activity of the "BREAST
CANCER GENE" polypeptide. Screening for compounds that interact with ion channels to either inhibit or promote their
activity can be based on (1.) binding and (2.) functional assays in living cells [Hille (183)].

1. For ligand-gated channels, e.g. ionotropic neurotransmitter/hormone receptors, assays can be designed detecting
binding to the target by competition between the compound and a labeled ligand.

2. Ion channel function can be tested functionally in living cells. Target proteins are either expressed endogenously
in appropriate reporter cells or are introduced recombinantly. Channel activity can be monitored by (2.1) concentration
changes of the permeating ion (most prominently Ca2+ ions), (2.2) by changes in the transmembrane electrical
potential gradient, and (2.3) by measuring a cellular response (e.g. expression of a reporter gene, secretion
of a neurotransmitter) triggered or modulated by the target activity.

2.1 Channel activity results in transmembrane ion fluxes. Thus activation of ionic channels can be monitored
by the resulting changes in intracellular ion concentrations using luminescent or fluorescent indicators. Because
of its wide dynamic range and the availability of suitable indicators this applies particularly to changes in
intracellular Ca2+ ion concentration ([Ca2+]i). [Ca2+]i can be measured, for example, by aequorin luminescence
or fluorescence dye technology (e.g. using Fluo-3, Indo-1, Fura-2). Cellular assays can be designed where
either the Ca2+ flux through the target channel itself is measured directly or where modulation of the target
channel affects membrane potential and thereby the activity of co-expressed voltage-gated Ca2+ channels.

2.2 Ion channel currents result in changes of electrical membrane potential (Vm) which can be monitored
directly using potentiometric fluorescent probes. These electrically charged indicators (e.g. the anionic oxonol
dye DiBAC4(3)) redistribute between extra- and intracellular compartment in response to voltage changes.
The equilibrium distribution is governed by the Nernst equation. Thus changes in membrane potential result
in concomitant changes in cellular fluorescence. Again, changes in Vm might be caused directly by the activity
of the target ion channel or through amplification and/or prolongation of the signal by channels co-expressed
in the same cell.

2.3 Target channel activity can cause cellular Ca2+ entry either directly or through activation of additional Ca2+
channels (see 2.1). The resulting intracellular Ca2+ signals regulate a variety of cellular responses, e.g. secretion
or gene transcription. Therefore modulation of the target channel can be detected by monitoring secretion of
a known hormone/transmitter from the target-expressing cell or through expression of a reporter gene (e.g.
luciferase) controlled by a Ca2+-responsive promoter element (e.g. cyclic AMP/Ca2+-responsive elements;
CRE).
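
The Nernst relation referred to in item 2.2 can be illustrated with a short calculation. The sketch below is a textbook formulation and not a procedure taken from this description; the function names, the temperature of 37°C and the example concentrations are assumptions chosen for illustration.

```python
import math

GAS_CONSTANT = 8.314462618   # J / (mol * K)
FARADAY = 96485.33212        # C / mol


def nernst_potential(valence, conc_out, conc_in, temp_celsius=37.0):
    """Equilibrium potential in volts (inside relative to outside) for one ion species."""
    if valence == 0 or conc_out <= 0 or conc_in <= 0:
        raise ValueError("valence must be non-zero and concentrations positive")
    temp_kelvin = temp_celsius + 273.15
    return GAS_CONSTANT * temp_kelvin / (valence * FARADAY) * math.log(conc_out / conc_in)


def indicator_ratio_in_over_out(valence, membrane_potential, temp_celsius=37.0):
    """Equilibrium inside/outside concentration ratio of a charged, membrane-permeant indicator."""
    temp_kelvin = temp_celsius + 273.15
    return math.exp(-valence * FARADAY * membrane_potential / (GAS_CONSTANT * temp_kelvin))


# Assumed example: a monovalent cation at 5 mmolar outside and 140 mmolar inside.
print(f"{nernst_potential(+1, 5.0, 140.0) * 1000:.1f} mV")

# A monovalent anionic indicator accumulates in the cell as the membrane depolarizes.
for potential in (-0.060, -0.030, 0.0):
    print(potential, round(indicator_ratio_in_over_out(-1, potential), 3))
```

For an anionic indicator the inside/outside ratio rises as the membrane potential becomes less negative, which is why depolarization is read out as an increase in cellular fluorescence.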

c) DNA-binding proteins and transcription factors 

[0291] In one embodiment, the "BREAST CANCER GENE" may encode a DNA-binding protein or a transcription
factor. The activity of such a DNA-binding protein or a transcription factor may be measured, for example, by a promoter
assay which measures the ability of the DNA-binding protein or the transcription factor to initiate transcription of a test
sequence linked to a particular promoter. In one embodiment, the present invention provides a method of screening
test compounds for their ability to modulate the activity of such a DNA-binding protein or a transcription factor by
measuring the changes in the expression of a test gene which is regulated by a promoter which is responsive to the
transcription factor.

d) Promoter assays

[0292] A promoter assay was set up with the human hepatocellular carcinoma cell line HepG2 that was stably transfected
with a luciferase gene under the control of a promoter regulated by a gene of interest (e.g. thyroid hormone). The vector
2xIROluc, which was used for transfection, carries a thyroid hormone responsive element (TRE) of two 12 bp inverted
palindromes separated by an 8 bp spacer in front of a tk minimal promoter and the luciferase gene. Test cultures were
seeded in 96-well plates in serum-free Eagle's Minimal Essential Medium supplemented with glutamine, tricine, sodium
pyruvate, non-essential amino acids, insulin, selenium and transferrin, and were cultivated in a humidified atmosphere at
10% CO2 at 37°C. After 48 hours of incubation serial dilutions of test compounds or reference compounds (e.g. L-T3,
L-T4) and co-stimulator if appropriate (final concentration 1 nM) were added to the cell cultures and incubation was
continued for the optimal time (e.g. another 4-72 hours). The cells were then lysed by addition of buffer containing
Triton X100 and luciferin and the luminescence of luciferase induced by T3 or other compounds was measured in a
luminometer. For each concentration of a test compound four replicates were tested. EC50 values for each test
compound were calculated by use of the GraphPad Prism scientific software.
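
For orientation, the following sketch shows one simple way an EC50 could be estimated from replicate luminescence readings. It is not the curve-fitting procedure of the GraphPad Prism software named above; it merely interpolates on a logarithmic concentration axis between the two doses that bracket the half-maximal response, assuming a monotonic dose-response. The function name and all data values are hypothetical.

```python
import math
from statistics import mean


def ec50_by_interpolation(dose_response):
    """Estimate EC50 by log-linear interpolation.

    dose_response maps concentration (molar) to a list of replicate readings.
    The responses at the lowest and highest dose are taken as bottom and top.
    """
    concentrations = sorted(dose_response)
    responses = [mean(dose_response[c]) for c in concentrations]
    half_max = responses[0] + (responses[-1] - responses[0]) / 2.0

    points = list(zip(concentrations, responses))
    for (c_low, r_low), (c_high, r_high) in zip(points, points[1:]):
        brackets_half_max = (r_low - half_max) * (r_high - half_max) <= 0
        if brackets_half_max and r_low != r_high:
            fraction = (half_max - r_low) / (r_high - r_low)
            log_ec50 = math.log10(c_low) + fraction * (math.log10(c_high) - math.log10(c_low))
            return 10 ** log_ec50
    raise ValueError("response never crosses the half-maximal level")


# Hypothetical readings, four replicates per concentration as in the assay above.
readings = {
    1e-10: [100, 100, 100, 100],
    1e-9: [300, 300, 300, 300],
    1e-8: [700, 700, 700, 700],
    1e-7: [900, 900, 900, 900],
}
print(f"EC50 is approximately {ec50_by_interpolation(readings):.2e} molar")
```

A full sigmoidal fit uses all data points and yields confidence intervals; the interpolation above is only a quick plausibility check.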

Screening Methods 

[0293] The invention provides assays for screening test compounds which bind to or modulate the activity of a
"BREAST CANCER GENE" polypeptide or a "BREAST CANCER GENE" polynucleotide. A test compound preferably
binds to a "BREAST CANCER GENE" polypeptide or polynucleotide. More preferably, a test compound decreases or
increases "BREAST CANCER GENE" activity by at least about 10, preferably about 50, more preferably about 75, 90,
or 100% relative to the absence of the test compound.
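
The percentage thresholds named above can be applied as in the following sketch, which computes the change in activity relative to the compound-free control and reports the highest threshold reached. The function names and the example activities are illustrative assumptions.

```python
THRESHOLDS = (100, 90, 75, 50, 10)  # percent levels named in the text, highest first


def percent_change(activity_with_compound, activity_without_compound):
    """Signed change in activity relative to the absence of the test compound."""
    if activity_without_compound == 0:
        raise ValueError("control activity must be non-zero")
    difference = activity_with_compound - activity_without_compound
    return 100.0 * difference / activity_without_compound


def highest_threshold_met(change_percent):
    """Return the largest threshold reached by the magnitude of the change, or None."""
    magnitude = abs(change_percent)
    for threshold in THRESHOLDS:
        if magnitude >= threshold:
            return threshold
    return None


# Hypothetical activities in arbitrary units.
change = percent_change(activity_with_compound=25.0, activity_without_compound=100.0)
print(change, highest_threshold_met(change))
```

A negative sign indicates a decrease in activity and a positive sign an increase; both count toward the same thresholds.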

Test Compounds

[0294] Test compounds can be pharmacological agents already known in the art or can be compounds previously
unknown to have any pharmacological activity. The compounds can be naturally occurring or designed in the laboratory.
They can be isolated from microorganisms, animals, or plants, and can be produced recombinantly, or synthesised by
chemical methods known in the art. If desired, test compounds can be obtained using any of the numerous combinatorial
library methods known in the art, including but not limited to, biological libraries, spatially addressable parallel solid
phase or solution phase libraries, synthetic library methods requiring de-convolution, the one-bead one-compound
library method, and synthetic library methods using affinity chromatography selection. The biological library approach
is limited to polypeptide libraries, while the other four approaches are applicable to polypeptide, non-peptide oligomer,
or small molecule libraries of compounds [for review see Lam, 1997, (151)].

[0295] Methods for the synthesis of molecular libraries are well known in the art [see, for example, DeWitt et al.,
1993, (152); Erb et al., 1994, (153); Zuckermann et al., 1994, (154); Cho et al., 1993, (155); Carell et al., 1994, (156)
and Gallop et al., 1994, (157)]. Libraries of compounds can be presented in solution [see, e.g., Houghten, 1992, (158)],
or on beads [Lam, 1991, (159)], DNA-chips [Fodor, 1993, (160)], bacteria or spores (Ladner, U.S. Patent 5,223,409),
plasmids [Cull et al., 1992, (161)], or phage [Scott & Smith, 1990, (162); Devlin, 1990, (163); Cwirla et al., 1990, (164);
Felici, 1991, (165)].

High Throughput Screening

[0296] Test compounds can be screened for the ability to bind to "BREAST CANCER GENE" polypeptides or
polynucleotides or to affect "BREAST CANCER GENE" activity or "BREAST CANCER GENE" expression using high
throughput screening. Using high throughput screening, many discrete compounds can be tested in parallel so that
large numbers of test compounds can be quickly screened. The most widely established techniques utilize 96-well,
384-well or 1536-well microtiter plates. The wells of the microtiter plates typically require assay volumes that range
from 5 to 500 µl. In addition to the plates, many instruments, materials, pipettors, robotics, plate washers, and plate
readers are commercially available to fit the microwell formats.
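
When screening data from these plate formats are processed, wells are commonly addressed by a row letter and a column number. The sketch below converts a zero-based, row-by-row well index into such a label. The layouts used (8 x 12, 16 x 24 and 32 x 48) are the customary geometries for 96-, 384- and 1536-well plates and are an assumption here, as are the function names; the text itself does not specify them.

```python
# Assumed customary layouts: plate size -> (rows, columns).
PLATE_LAYOUTS = {96: (8, 12), 384: (16, 24), 1536: (32, 48)}


def row_label(row_index):
    """Spreadsheet-style row label: 0 -> A, 25 -> Z, 26 -> AA, 31 -> AF."""
    label = ""
    remaining = row_index
    while True:
        remaining, letter = divmod(remaining, 26)
        label = chr(ord("A") + letter) + label
        if remaining == 0:
            return label
        remaining -= 1


def well_name(well_index, plate_size):
    """Label of the well at a zero-based index counted row by row."""
    rows, columns = PLATE_LAYOUTS[plate_size]
    if not 0 <= well_index < rows * columns:
        raise ValueError("well index outside the plate")
    row, column = divmod(well_index, columns)
    return f"{row_label(row)}{column + 1}"


for size in PLATE_LAYOUTS:
    rows, columns = PLATE_LAYOUTS[size]
    print(size, well_name(0, size), well_name(rows * columns - 1, size))
```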

[0297] Alternatively, free format assays, or assays that have no physical barrier between samples, can be used. For
example, an assay using pigment cells (melanocytes) in a simple homogeneous assay for combinatorial peptide
libraries is described by Jayawickreme et al., (166). The cells are placed under agarose in culture dishes, then beads
that carry combinatorial compounds are placed on the surface of the agarose. The combinatorial compounds are
partially released from the beads. Active compounds can be visualised as dark pigment areas because,
as the compounds diffuse locally into the gel matrix, the active compounds cause the cells to change colors.

[0298] Another example of a free format assay is described by Chelsky, (167). Chelsky placed a simple homogeneous
enzyme assay for carbonic anhydrase inside an agarose gel such that the enzyme in the gel would cause a color
change throughout the gel.

[0299] Thereafter, beads carrying combinatorial compounds via a photolinker were placed inside the gel and the
compounds were partially released by UV light. Compounds that inhibited the enzyme were observed as local zones
of inhibition having less color change.

[0300] In another example, combinatorial libraries were screened for compounds that had cytotoxic effects on cancer 
cells growing in agar [Salmon et al., 1996, (168)]. 

[0301] Another high throughput screening method is described in Beutel et al., U.S. Patent 5,976,813. In this method,
test samples are placed in a porous matrix. One or more assay components are then placed within, on top of, or at
the bottom of a matrix such as a gel, a plastic sheet, a filter, or other form of easily manipulated solid support. When
samples are introduced to the porous matrix they diffuse sufficiently slowly, such that the assays can be performed
without the test samples running together.

Binding Assays 


[0302] For binding assays, the test compound is preferably a small molecule which binds to and occupies, for ex- 
ample, the ATP/GTP binding site of the enzyme or the active site of a "BREAST CANCER GENE" polypeptide, such 
that normal biological activity is prevented. Examples of such small molecules include, but are not limited to, small 
peptides or peptide-like molecules. 

[0303] In binding assays, either the test compound or a "BREAST CANCER GENE" polypeptide can comprise a
detectable label, such as a fluorescent, radioisotopic, chemiluminescent, or enzymatic label, such as horseradish
peroxidase, alkaline phosphatase, or luciferase. Detection of a test compound which is bound to a "BREAST CANCER
GENE" polypeptide can then be accomplished, for example, by direct counting of radioemission, by scintillation
counting, or by determining conversion of an appropriate substrate to a detectable product.

[0304] Alternatively, binding of a test compound to a "BREAST CANCER GENE" polypeptide can be determined
without labeling either of the interactants. For example, a microphysiometer can be used to detect binding of a test
compound with a "BREAST CANCER GENE" polypeptide. A microphysiometer (e.g., Cytosensor™) is an analytical
instrument that measures the rate at which a cell acidifies its environment using a light-addressable potentiometric
sensor (LAPS). Changes in this acidification rate can be used as an indicator of the interaction between a test compound
and a "BREAST CANCER GENE" polypeptide [McConnell et al., 1992, (169)].

[0305] Determining the ability of a test compound to bind to a "BREAST CANCER GENE" polypeptide also can be
accomplished using a technology such as real-time Bimolecular Interaction Analysis (BIA) [Sjolander & Urbaniczky,
1991, (170), and Szabo et al., 1995, (171)]. BIA is a technology for studying biospecific interactions in real time, without
labeling any of the interactants (e.g., BIAcore™). Changes in the optical phenomenon surface plasmon resonance
(SPR) can be used as an indication of real-time reactions between biological molecules.

[0306] In yet another aspect of the invention, a "BREAST CANCER GENE" polypeptide can be used as a "bait
protein" in a two-hybrid assay or three-hybrid assay [see, e.g., U.S. Patent 5,283,317; Zervos et al., 1993, (172);
Madura et al., 1993, (173); Bartel et al., 1993, (174); Iwabuchi et al., 1993, (175) and Brent WO 94/10300], to identify
other proteins which bind to or interact with the "BREAST CANCER GENE" polypeptide and modulate its activity.

[0307] The two-hybrid system is based on the modular nature of most transcription factors, which consist of separable
DNA-binding and activation domains. Briefly, the assay utilizes two different DNA constructs. For example, in one
construct, a polynucleotide encoding a "BREAST CANCER GENE" polypeptide can be fused to a polynucleotide
encoding the DNA binding domain of a known transcription factor (e.g., GAL4). In the other construct a DNA sequence that
encodes an unidentified protein ("prey" or "sample") can be fused to a polynucleotide that codes for the activation
domain of the known transcription factor. If the "bait" and the "prey" proteins are able to interact in vivo to form a
protein-dependent complex, the DNA-binding and activation domains of the transcription factor are brought into close
proximity. This proximity allows transcription of a reporter gene (e.g., LacZ), which is operably linked to a transcriptional
regulatory site responsive to the transcription factor. Expression of the reporter gene can be detected, and cell colonies
containing the functional transcription factor can be isolated and used to obtain the DNA sequence encoding the protein
which interacts with the "BREAST CANCER GENE" polypeptide.

[0308] It may be desirable to immobilize either a "BREAST CANCER GENE" polypeptide (or polynucleotide) or the
test compound to facilitate separation of bound from unbound forms of one or both of the interactants, as well as to
accommodate automation of the assay. Thus, either a "BREAST CANCER GENE" polypeptide (or polynucleotide) or
the test compound can be bound to a solid support. Suitable solid supports include, but are not limited to, glass or
plastic slides, tissue culture plates, microtiter wells, tubes, silicon chips, or particles such as beads (including, but not
limited to, latex, polystyrene, or glass beads). Any method known in the art can be used to attach a "BREAST CANCER
GENE" polypeptide (or polynucleotide) or test compound to a solid support, including use of covalent and non-covalent
linkages, passive absorption, or pairs of binding moieties attached respectively to the polypeptide (or polynucleotide)
or test compound and the solid support. Test compounds are preferably bound to the solid support in an array, so that
the location of individual test compounds can be tracked. Binding of a test compound to a "BREAST CANCER GENE"
polypeptide (or polynucleotide) can be accomplished in any vessel suitable for containing the reactants. Examples of
such vessels include microtiter plates, test tubes, and microcentrifuge tubes.

[0309] In one embodiment, a "BREAST CANCER GENE" polypeptide is a fusion protein comprising a domain that
allows the "BREAST CANCER GENE" polypeptide to be bound to a solid support. For example, glutathione
S-transferase fusion proteins can be adsorbed onto glutathione sepharose beads (Sigma Chemical, St. Louis, Mo.) or
glutathione derivatized microtiter plates, which are then combined with the test compound or the test compound and the
nonadsorbed "BREAST CANCER GENE" polypeptide; the mixture is then incubated under conditions conducive to
complex formation (e.g., at physiological conditions for salt and pH). Following incubation, the beads or microtiter plate
wells are washed to remove any unbound components. Binding of the interactants can be determined either directly
or indirectly, as described above. Alternatively, the complexes can be dissociated from the solid support before binding
is determined.

[0310] Other techniques for immobilising proteins or polynucleotides on a solid support also can be used in the
screening assays of the invention. For example, either a "BREAST CANCER GENE" polypeptide (or polynucleotide)
or a test compound can be immobilized utilizing conjugation of biotin and streptavidin. Biotinylated "BREAST CANCER
GENE" polypeptides (or polynucleotides) or test compounds can be prepared from biotin NHS (N-hydroxysuccinimide)
using techniques well known in the art (e.g., biotinylation kit, Pierce Chemicals, Rockford, Ill.) and immobilized in the
wells of streptavidin-coated 96-well plates (Pierce Chemical). Alternatively, antibodies which specifically bind to a
"BREAST CANCER GENE" polypeptide, polynucleotide, or a test compound, but which do not interfere with a desired
binding site, such as the ATP/GTP binding site or the active site of the "BREAST CANCER GENE" polypeptide, can
be derivatised to the wells of the plate. Unbound target or protein can be trapped in the wells by antibody conjugation.

[0311] Methods for detecting such complexes, in addition to those described above for the GST-immobilized
complexes, include immunodetection of complexes using antibodies which specifically bind to a "BREAST CANCER GENE"
polypeptide or test compound, enzyme-linked assays which rely on detecting an activity of a "BREAST CANCER
GENE" polypeptide, and SDS gel electrophoresis under non-reducing conditions.

[0312] Screening for test compounds which bind to a "BREAST CANCER GENE" polypeptide or polynucleotide also
can be carried out in an intact cell. Any cell which comprises a "BREAST CANCER GENE" polypeptide or polynucleotide
can be used in a cell-based assay system. A "BREAST CANCER GENE" polynucleotide can be naturally occurring in
the cell or can be introduced using techniques such as those described above. Binding of the test compound to a
"BREAST CANCER GENE" polypeptide or polynucleotide is determined as described above.

Modulation of Gene Expression 

[0313] In another embodiment, test compounds which increase or decrease "BREAST CANCER GENE" expression
are identified. A "BREAST CANCER GENE" polynucleotide is contacted with a test compound, and the expression of
an RNA or polypeptide product of the "BREAST CANCER GENE" polynucleotide is determined. The level of expression
of appropriate mRNA or polypeptide in the presence of the test compound is compared to the level of expression of
mRNA or polypeptide in the absence of the test compound. The test compound can then be identified as a modulator
of expression based on this comparison. For example, when expression of mRNA or polypeptide is greater in the
presence of the test compound than in its absence, the test compound is identified as a stimulator or enhancer of the
mRNA or polypeptide expression. Alternatively, when expression of the mRNA or polypeptide is less in the presence
of the test compound than in its absence, the test compound is identified as an inhibitor of the mRNA or polypeptide
expression.
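
The comparison described in this paragraph reduces to a simple decision rule, sketched below for replicate expression measurements. The function name, the use of the replicate mean and the example values are assumptions for illustration; in practice a statistical test would normally be applied before calling a difference real.

```python
from statistics import mean


def classify_expression_modulator(levels_with_compound, levels_without_compound):
    """Label a test compound by comparing mean expression with and without it."""
    level_with = mean(levels_with_compound)
    level_without = mean(levels_without_compound)
    if level_with > level_without:
        return "stimulator"
    if level_with < level_without:
        return "inhibitor"
    return "no change"


# Hypothetical mRNA levels in arbitrary units.
print(classify_expression_modulator([2.0, 2.2, 1.9], [1.0, 1.1, 0.9]))
print(classify_expression_modulator([0.4, 0.5, 0.6], [1.0, 1.1, 0.9]))
```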

[0314] The level of "BREAST CANCER GENE" mRNA or polypeptide expression in the cells can be determined by
methods well known in the art for detecting mRNA or polypeptide. Either qualitative or quantitative methods can be
used. The presence of polypeptide products of a "BREAST CANCER GENE" polynucleotide can be determined, for
example, using a variety of techniques known in the art, including immunochemical methods such as
radioimmunoassay, Western blotting, and immunohistochemistry. Alternatively, polypeptide synthesis can be determined in vivo,
in a cell culture, or in an in vitro translation system by detecting incorporation of labeled amino acids into a "BREAST
CANCER GENE" polypeptide.

[0315] Such screening can be carried out either in a cell-free assay system or in an intact cell. Any cell which
expresses a "BREAST CANCER GENE" polynucleotide can be used in a cell-based assay system. A "BREAST CANCER
GENE" polynucleotide can be naturally occurring in the cell or can be introduced using techniques such as those
described above. Either a primary culture or an established cell line, such as CHO or human embryonic kidney 293
cells, can be used.

Therapeutic Indications and Methods

[0316] Therapies for treatment of breast cancer have primarily relied upon effective chemotherapeutic drugs for
intervention in cell proliferation, cell growth or angiogenesis. The advent of genomics-driven molecular target identification
has opened up the possibility of identifying new breast cancer-specific targets for therapeutic intervention that will
provide safer, more effective treatments for malignant neoplasia patients and breast cancer patients in particular. Thus,
newly discovered breast cancer-associated genes and their products can be used as tools to develop innovative
therapies. The identification of the Her2/neu receptor kinase presents exciting new opportunities for treatment of a certain
subset of tumor patients as described before. Genes playing important roles in any of the physiological processes
outlined above can be characterized as breast cancer targets. Genes or gene fragments identified through genomics
can readily be expressed in one or more heterologous expression systems to produce functional recombinant proteins.
These proteins are characterized in vitro for their biochemical properties and then used as tools in high-throughput
molecular screening programs to identify chemical modulators of their biochemical activities. Modulators of target gene
expression or protein activity can be identified in this manner and subsequently tested in cellular and in vivo disease
models for therapeutic activity. Optimization of lead compounds with iterative testing in biological models and detailed
pharmacokinetic and toxicological analyses form the basis for drug development and subsequent testing in humans.

[0317] This invention further pertains to the use of novel agents identified by the screening assays described above.
Accordingly, it is within the scope of this invention to use a test compound identified as described herein in an appropriate
animal model. For example, an agent identified as described herein (e.g., a modulating agent, an antisense
polynucleotide molecule, a specific antibody, ribozyme, or a human "BREAST CANCER GENE" polypeptide binding molecule)
can be used in an animal model to determine the efficacy, toxicity, or side effects of treatment with such an agent.
Alternatively, an agent identified as described herein can be used in an animal model to determine the mechanism of
action of such an agent. Furthermore, this invention pertains to uses of novel agents identified by the above described
screening assays for treatments as described herein.

[0318] A reagent which affects human "BREAST CANCER GENE" activity can be administered to a human cell, 
45 either in vitro or in vivo, to reduce or increase human "BREAST CANCER GENE" activity. The reagent preferably binds 
to an expression product of a human "BREAST CANCER GENE". If the expression product is a protein, the reagent 
is preferably an antibody. For treatment of human cells ex vivo, an antibody can be added to a preparation of stem 
cells which have been removed from the body. The cells can then be replaced in the same or another human body, 
with or without clonal propagation, as is known in the art. 
50 [0319] In one embodiment, the reagent is delivered using a liposome. Preferably, the liposome is stable in the animal 
into which it has been administered for at least about 30 minutes, more preferably for at least about 1 hour, and even 
more preferably for at least about 24 hours. A liposome comprises a lipid composition that is capable of targeting a 
reagent, particularly a polynucleotide, to a particular site in an animal, such as a human. Preferably, the lipid composition 
of the liposome is capable of targeting to a specific organ of an animal, such as the lung, liver, spleen, heart, brain,
lymph nodes, and skin.

[0320] A liposome useful in the present invention comprises a lipid composition that is capable of fusing with the 
plasma membrane of the targeted cell to deliver its contents to the cell. Preferably, the transfection efficiency of a 
liposome is about 0.5 µg of DNA per 16 nmol of liposome delivered to about 10^6 cells, more preferably about 1.0 µg
of DNA per 16 nmol of liposome delivered to about 10^6 cells, and even more preferably about 2.0 µg of DNA per 16
nmol of liposome delivered to about 10^6 cells. Preferably, a liposome is between about 100 and 500 nm, more preferably
between about 150 and 450 nm, and even more preferably between about 200 and 400 nm in diameter. 
[0321] Suitable liposomes for use in the present invention include those liposomes usually used in, for example, 
gene delivery methods known to those of skill in the art. More preferred liposomes include liposomes having a polycationic
lipid composition and/or liposomes having a cholesterol backbone conjugated to polyethylene glycol. Optionally,
a liposome comprises a compound capable of targeting the liposome to a particular cell type, such as a cell-specific
ligand exposed on the outer surface of the liposome. 

[0322] Complexing a liposome with a reagent such as an antisense oligonucleotide or ribozyme can be achieved 
using methods which are standard in the art (see, for example, U.S. Patent 5,705,151). Preferably, from about 0.1 µg
to about 10 µg of polynucleotide is combined with about 8 nmol of liposomes, more preferably from about 0.5 µg to
about 5 µg of polynucleotides is combined with about 8 nmol liposomes, and even more preferably about 1.0 µg of
polynucleotides is combined with about 8 nmol liposomes.

[0323] In another embodiment, antibodies can be delivered to specific tissues in vivo using receptor-mediated targeted
delivery. Receptor-mediated DNA delivery techniques are taught in, for example, Findeis et al., 1993, (176);
Chiou et al., 1994, (177); Wu & Wu, 1988, (178); Wu et al., 1994, (179); Zenke et al., 1990, (180); Wu et al., 1991, (181).

Determination of a Therapeutically Effective Dose 

[0324] The determination of a therapeutically effective dose is well within the capability of those skilled in the art. A
therapeutically effective dose refers to that amount of active ingredient which increases or decreases human "BREAST 
CANCER GENE" activity relative to the human "BREAST CANCER GENE" activity which occurs in the absence of the 
therapeutically effective dose. 

[0325] For any compound, the therapeutically effective dose can be estimated initially either in cell culture assays 
or in animal models, usually mice, rabbits, dogs, or pigs. The animal model also can be used to determine the appropriate
concentration range and route of administration. Such information can then be used to determine useful doses
and routes for administration in humans. 

[0326] Therapeutic efficacy and toxicity, e.g., ED50 (the dose therapeutically effective in 50% of the population) and
LD50 (the dose lethal to 50% of the population), can be determined by standard pharmaceutical procedures in cell
cultures or experimental animals. The dose ratio of toxic to therapeutic effects is the therapeutic index, and it can be
expressed as the ratio, LD50/ED50.
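
The ratio described in paragraph [0326] can be illustrated with a minimal sketch. The function name and the numeric doses below are hypothetical and chosen only for illustration; they are not values taken from this disclosure.

```python
def therapeutic_index(ld50: float, ed50: float) -> float:
    """Return the therapeutic index LD50/ED50 for two doses given in the same unit."""
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


# Hypothetical example doses (same unit for both), not measured data.
example_index = therapeutic_index(ld50=200.0, ed50=10.0)
print(example_index)
```

A larger value indicates a wider margin between the therapeutically effective dose and the lethal dose, which is why compositions with large therapeutic indices are preferred.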

[0327] Pharmaceutical compositions which exhibit large therapeutic indices are preferred. The data obtained from 
cell culture assays and animal studies is used in formulating a range of dosage for human use. The dosage contained 
in such compositions is preferably within a range of circulating concentrations that include the ED50 with little or no
toxicity. The dosage varies within this range depending upon the dosage form employed, sensitivity of the patient, and
the route of administration. 

[0328] The exact dosage will be determined by the practitioner, in light of factors related to the subject that requires 
treatment. Dosage and administration are adjusted to provide sufficient levels of the active ingredient or to maintain 
the desired effect. Factors which can be taken into account include the severity of the disease state, general health of 
the subject, age, weight, and gender of the subject, diet, time and frequency of administration, drug combination(s),
reaction sensitivities, and tolerance/response to therapy. Long-acting pharmaceutical compositions can be adminis- 
tered every 3 to 4 days, every week, or once every two weeks depending on the half-life and clearance rate of the 
particular formulation. 

[0329] Normal dosage amounts can vary from 0.1 to 100,000 micrograms, up to a total dose of about 1 g, depending 
upon the route of administration. Guidance as to particular dosages and methods of delivery is provided in the literature
and generally available to practitioners in the art. Those skilled in the art will employ different formulations for nucleotides 
than for proteins or their inhibitors. Similarly, delivery of polynucleotides or polypeptides will be specific to particular 
cells, conditions, locations, etc. 

[0330] If the reagent is a single-chain antibody, polynucleotides encoding the antibody can be constructed and introduced
into a cell either ex vivo or in vivo using well-established techniques including, but not limited to, transferrin-
polycation-mediated DNA transfer, transfection with naked or encapsulated nucleic acids, liposome-mediated cellular 
fusion, intracellular transportation of DNA-coated latex beads, protoplast fusion, viral infection, electroporation, a gene 
gun, and DEAE- or calcium phosphate-mediated transfection. 

[0331] Effective in vivo dosages of an antibody are in the range of about 5 µg to about 50 µg/kg, about 50 µg to about
5 mg/kg, about 100 µg to about 500 µg/kg of patient body weight, and about 200 to about 250 µg/kg of patient body
weight. For administration of polynucleotides encoding single-chain antibodies, effective in vivo dosages are in the
range of about 100 ng to about 200 ng, 500 ng to about 50 mg, about 1 µg to about 2 mg, about 5 µg to about 500 µg,
and about 20 µg to about 100 µg of DNA.

[0332] If the expression product is mRNA, the reagent is preferably an antisense oligonucleotide or a ribozyme.
Polynucleotides which express antisense oligonucleotides or ribozymes can be introduced into cells by a variety of 
methods, as described above. 

[0333] Preferably, a reagent reduces expression of a "BREAST CANCER GENE" gene or the activity of a "BREAST 
CANCER GENE" polypeptide by at least about 10, preferably about 50, more preferably about 75, 90, or 100% relative
to the absence of the reagent. The effectiveness of the mechanism chosen to decrease the level of expression of a 
"BREAST CANCER GENE" gene or the activity of a "BREAST CANCER GENE" polypeptide can be assessed using 
methods well known in the art, such as hybridization of nucleotide probes to "BREAST CANCER GENE"-specific mRNA,
quantitative RT-PCR, immunologic detection of a "BREAST CANCER GENE" polypeptide, or measurement of
"BREAST CANCER GENE" activity.

[0334] In any of the embodiments described above, any of the pharmaceutical compositions of the invention can be 
administered in combination with other appropriate therapeutic agents. Selection of the appropriate agents for use in 
combination therapy can be made by one of ordinary skill in the art, according to conventional pharmaceutical principles. 
The combination of therapeutic agents can act synergistically to effect the treatment or prevention of the various disorders
described above. Using this approach, one may be able to achieve therapeutic efficacy with lower dosages of
each agent, thus reducing the potential for adverse side effects. 

[0335] Any of the therapeutic methods described above can be applied to any subject in need of such therapy, 
including, for example, birds and mammals such as dogs, cats, cows, pigs, sheep, goats, horses, rabbits, monkeys, 
and most preferably, humans. 

[0336] All patents and patent applications cited in this disclosure are expressly incorporated herein by reference.
The above disclosure generally describes the present invention. A more complete understanding can be obtained by 
reference to the following specific examples which are provided for purposes of illustration only and are not intended 
to limit the scope of the invention. 

Pharmaceutical Compositions

[0337] The invention also provides pharmaceutical compositions which can be administered to a patient to achieve 
a therapeutic effect. Pharmaceutical compositions of the invention can comprise, for example, a "BREAST CANCER 
GENE" polypeptide, "BREAST CANCER GENE" polynucleotide, ribozymes or antisense oligonucleotides, antibodies
which specifically bind to a "BREAST CANCER GENE" polypeptide, or mimetics, agonists, antagonists, or inhibitors
of a "BREAST CANCER GENE" polypeptide activity. The compositions can be administered alone or in combination 
with at least one other agent, such as a stabilizing compound, which can be administered in any sterile, biocompatible
pharmaceutical carrier, including, but not limited to, saline, buffered saline, dextrose, and water. The compositions can 
be administered to a patient alone, or in combination with other agents, drugs or hormones. 

[0338] In addition to the active ingredients, these pharmaceutical compositions can contain suitable pharmaceutically
acceptable carriers comprising excipients and auxiliaries which facilitate processing of the active compounds into preparations
which can be used pharmaceutically. Pharmaceutical compositions of the invention can be administered by
any number of routes including, but not limited to, oral, intravenous, intramuscular, intraarterial, intramedullary, intrathecal,
intraventricular, transdermal, subcutaneous, intraperitoneal, intranasal, parenteral, topical, sublingual, or rectal
means. Pharmaceutical compositions for oral administration can be formulated using pharmaceutically acceptable
carriers well known in the art in dosages suitable for oral administration. Such carriers enable the pharmaceutical 
compositions to be formulated as tablets, pills, dragees, capsules, liquids, gels, syrups, slurries, suspensions, and the 
like, for ingestion by the patient. 

[0339] Pharmaceutical preparations for oral use can be obtained through combination of active compounds with 
solid excipient, optionally grinding a resulting mixture, and processing the mixture of granules, after adding suitable
auxiliaries, if desired, to obtain tablets or dragee cores. Suitable excipients are carbohydrate or protein fillers, such as
sugars, including lactose, sucrose, mannitol, or sorbitol; starch from corn, wheat, rice, potato, or other plants; cellulose, 
such as methyl cellulose, hydroxypropylmethylcellulose, or sodium carboxymethylcellulose; gums including arabic and 
tragacanth; and proteins such as gelatin and collagen. If desired, disintegrating or solubilizing agents can be added, 
such as the cross-linked polyvinyl pyrrolidone, agar, alginic acid, or a salt thereof, such as sodium alginate.

[0340] Dragee cores can be used in conjunction with suitable coatings, such as concentrated sugar solutions, which 
also can contain gum arabic, talc, polyvinylpyrrolidone, carbopol gel, polyethylene glycol, and/or titanium dioxide, lac- 
quer solutions, and suitable organic solvents or solvent mixtures. Dyestuffs or pigments can be added to the tablets 
or dragee coatings for product identification or to characterize the quantity of active compound, i.e., dosage. 
[0341] Pharmaceutical preparations which can be used orally include push-fit capsules made of gelatin, as well as
soft, sealed capsules made of gelatin and a coating, such as glycerol or sorbitol. Push-fit capsules can contain active 
ingredients mixed with a filler or binders, such as lactose or starches, lubricants, such as talc or magnesium stearate, 
and, optionally, stabilizers. In soft capsules, the active compounds can be dissolved or suspended in suitable liquids,
such as fatty oils, liquid, or liquid polyethylene glycol with or without stabilizers.

[0342] Pharmaceutical formulations suitable for parenteral administration can be formulated in aqueous solutions, 
preferably in physiologically compatible buffers such as Hanks' solution, Ringer's solution, or physiologically buffered 
saline. Aqueous injection suspensions can contain substances which increase the viscosity of the suspension, such
as sodium carboxymethyl cellulose, sorbitol, or dextran. Additionally, suspensions of the active compounds can be
prepared as appropriate oily injection suspensions. Suitable lipophilic solvents or vehicles include fatty oils such as 
sesame oil, or synthetic fatty acid esters, such as ethyl oleate or triglycerides, or liposomes. Non-lipid polycationic 
amino polymers also can be used for delivery. Optionally, the suspension also can contain suitable stabilizers or agents 
which increase the solubility of the compounds to allow for the preparation of highly concentrated solutions. For topical
or nasal administration, penetrants appropriate to the particular barrier to be permeated are used in the formulation.
Such penetrants are generally known in the art. 

[0343] The pharmaceutical compositions of the present invention can be manufactured in a manner that is known 
in the art, e.g., by means of conventional mixing, dissolving, granulating, dragee making, levigating, emulsifying, encapsulating,
entrapping, or lyophilizing processes. The pharmaceutical composition can be provided as a salt and can
be formed with many acids, including but not limited to, hydrochloric, sulfuric, acetic, lactic, tartaric, malic, succinic,
etc. Salts tend to be more soluble in aqueous or other protonic solvents than are the corresponding free base forms. 
In other cases, the preferred preparation can be a lyophilized powder which can contain any or all of the following: 1-50
mM histidine, 0.1%-2% sucrose, and 2-7% mannitol, at a pH range of 4.5 to 5.5, that is combined with buffer prior to use.
[0344] Further details on techniques for formulation and administration can be found in the latest edition of REMINGTON'S
PHARMACEUTICAL SCIENCES (182). After pharmaceutical compositions have been prepared, they can
be placed in an appropriate container and labeled for treatment of an indicated condition. Such labeling would include 
amount, frequency, and method of administration. 

Material and Methods

[0345] One strategy for identifying genes that are involved in breast cancer is to detect genes that are expressed 
differentially under conditions associated with the disease versus non-disease conditions. The sub-sections below 
describe a number of experimental systems which may be used to detect such differentially expressed genes. In general,
these experimental systems include at least one experimental condition in which subjects or samples are treated
in a manner associated with breast cancer, in addition to at least one experimental control condition lacking such
disease associated treatment. Differentially expressed genes are detected, as described below, by comparing the 
pattern of gene expression between the experimental and control conditions. 

[0346] Once a particular gene has been identified through the use of one such experiment, its expression pattern 
may be further characterized by studying its expression in a different experiment and the findings may be validated by 
an independent technique. Such use of multiple experiments may be useful in distinguishing the roles and relative
importance of particular genes in breast cancer. A combined approach, comparing gene expression pattern in cells 
derived from breast cancer patients to those of in vitro cell culture models can give substantial hints on the pathways 
involved in development and/or progression of breast cancer. 

[0347] Among the experiments which may be utilized for the identification of differentially expressed genes involved 
in malignant neoplasia and breast cancer, for example, are experiments designed to analyze those genes which are
involved in signal transduction. Such experiments may serve to identify genes involved in the proliferation of cells. 
[0348] Methods for the identification of genes which are involved in breast cancer are described below. Such genes
are differentially expressed in breast cancer conditions relative to their expression in normal, or
non-breast cancer conditions or upon experimental manipulation based on clinical observations. Such differentially
expressed genes represent "target" and/or "marker" genes. Methods for the further characterization of such differentially
expressed genes, and for their identification as target and/or marker genes, are presented below.
[0349] Alternatively, a differentially expressed gene may have its expression modulated, i.e., quantitatively increased 
or decreased, in normal versus breast cancer states, or under control versus experimental conditions. The degree to 
which expression differs in normal versus breast cancer or control versus experimental states need only be large 
enough to be visualized via standard characterization techniques, such as, for example, the differential display technique
described below. Other such standard characterization techniques by which expression differences may be visualized
include but are not limited to quantitative RT-PCR and Northern analyses, which are well known to those of
skill in the art. 

EXAMPLE 1

Expression profiling 

a) Expression profiling utilizing quantitative RT-PCR

[0350] For a detailed analysis of gene expression by quantitative PCR methods, one will utilize primers flanking the 
genomic region of interest and a fluorescent labeled probe hybridizing in-between. Using the PRISM 7700 Sequence 
Detection System of PE Applied Biosystems (Perkin Elmer, Foster City, CA, USA) with the technique of a fluorogenic
probe, consisting of an oligonucleotide labeled with both a fluorescent reporter dye and a quencher dye, one can
perform such an expression measurement. Amplification of the probe-specific product causes cleavage of the probe,
generating an increase in reporter fluorescence. Primers and probes were selected using the Primer Express software
and localized mostly in the 3' region of the coding sequence or in the 3' untranslated region (see Table 5 for primer
and probe sequences) according to the relative positions of the probe sequence used for the construction of the
Affymetrix HG_U95A-E or HG-U133A-B DNA-chips. All primer pairs were checked for specificity by conventional PCR
reactions. To standardize the amount of sample RNA, GAPDH was selected as a reference, since it was not differentially 
regulated in the samples analyzed. TaqMan validation experiments were performed showing that the efficiencies of 
the target and the control amplifications are approximately equal, which is a prerequisite for the relative quantification
of gene expression by the comparative ΔΔCT method, known to those with skills in the art.
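
As an illustration of the comparative ΔΔCT calculation, the following sketch normalizes the target gene to the reference gene (GAPDH in the experiments above) in both the sample and a calibrator, and converts the difference into a fold change. It assumes that the amount of product doubles with each cycle, i.e. amplification efficiencies of about 100% for both target and reference. The function name and the CT values are hypothetical.

```python
def relative_expression(ct_target_sample: float, ct_ref_sample: float,
                        ct_target_calibrator: float, ct_ref_calibrator: float) -> float:
    """Fold change of the target gene in the sample relative to the calibrator."""
    # Delta CT: normalize the target to the reference gene within each specimen.
    delta_sample = ct_target_sample - ct_ref_sample
    delta_calibrator = ct_target_calibrator - ct_ref_calibrator
    # Delta-delta CT: difference between sample and calibrator.
    delta_delta_ct = delta_sample - delta_calibrator
    # Assumption: one cycle corresponds to a two-fold change in template amount.
    return 2.0 ** (-delta_delta_ct)


# Hypothetical CT values, for illustration only.
fold_change = relative_expression(
    ct_target_sample=22.0, ct_ref_sample=18.0,
    ct_target_calibrator=25.0, ct_ref_calibrator=18.0,
)
print(fold_change)
```

In this example the target reaches threshold three cycles earlier in the sample than in the calibrator after normalization, which corresponds to an eight-fold higher expression under the stated assumption.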

[0351] In addition to the technology provided by Perkin Elmer, one may use other implementations such as the LightCycler™
from Roche Inc. or the iCycler from Stratagene Inc.

b) Expression profiling utilizing DNA microarrays

[0352] Expression profiling can be carried out using the Affymetrix Array Technology. By hybridization of mRNA to
such a DNA-array or DNA-Chip, it is possible to identify the expression value of each transcript from the signal intensity
at a certain position of the array. Usually these DNA-arrays are produced by spotting of cDNA, oligonucleotides or subcloned
DNA fragments. In the case of Affymetrix technology, approx. 400,000 individual oligonucleotide sequences were synthesized
on the surface of a silicon wafer at distinct positions. The minimal length of oligomers is 12 nucleotides,
preferably 25 nucleotides or the full length of the transcript in question. Expression profiling may also be carried out by
hybridization to nylon or nitrocellulose membrane bound DNA or oligonucleotides. Detection of signals derived from 
hybridization may be obtained by either colorimetric, fluorescent, electrochemical, electronic, optic or by radioactive 
readout. Detailed descriptions of array construction have been given above and in other patents cited. To determine
the quantitative and qualitative changes in the chromosomal region to be analyzed, RNA from tumor tissue which is suspected
to contain such genomic alterations has to be compared to RNA extracted from benign tissue (e.g. epithelial
breast tissue, or micro dissected ductal tissue) on the basis of expression profiles for the whole transcriptome. With 
minor modifications, the sample preparation protocol followed the Affymetrix GeneChip Expression Analysis Manual 
(Santa Clara, CA). Total RNA extraction and isolation from tumor or benign tissues, biopsies, cell isolates or cell-containing
body fluids can be performed by using TRIzol (Life Technologies, Rockville, MD) and Oligotex mRNA Midi kit
(Qiagen, Hilden, Germany), and an ethanol precipitation step should be carried out to bring the concentration to 1 mg/ml.
Double-stranded cDNA is created from 5-10 mg of mRNA using the Superscript system (Life Technologies). First
strand cDNA synthesis was primed with a T7-(dT24) oligonucleotide. The cDNA can be extracted with phenol/chloroform
and precipitated with ethanol to a final concentration of 1 mg/ml. From the generated cDNA, cRNA can be synthesized
using Enzo's (Enzo Diagnostics Inc., Farmingdale, NY) in vitro Transcription Kit. Within the same step the
cRNA can be labeled with biotin nucleotides Bio-11-CTP and Bio-16-UTP (Enzo Diagnostics Inc., Farmingdale, NY).
After labeling and cleanup (Qiagen, Hilden, Germany) the cRNA should then be fragmented in an appropriate fragmentation
buffer (e.g., 40 mM Tris-Acetate, pH 8.1, 100 mM KOAc, 30 mM MgOAc, for 35 minutes at 94°C). As per
the Affymetrix protocol, fragmented cRNA should be hybridized on the HG_U133 arrays A and B, comprising approx.
40,000 probed transcripts each, for 24 hours at 60 rpm in a 45°C hybridization oven. After the hybridization step the chip
surfaces have to be washed and stained with streptavidin phycoerythrin (SAPE; Molecular Probes, Eugene, OR) in
Affymetrix fluidics stations. To amplify staining, a second labeling step can be introduced, which is recommended but
not compulsory. Here one should add SAPE solution twice with an antistreptavidin biotinylated antibody. Hybridization
to the probe arrays may be detected by fluorometric scanning (Hewlett Packard Gene Array Scanner; Hewlett Packard
Corporation, Palo Alto, CA).

[0353] After hybridization and scanning, the microarray images can be analyzed for quality control, looking for major
chip defects or abnormalities in hybridization signal. For this purpose either Affymetrix GeneChip MAS 5.0 Software or other
microarray image analysis software can be utilized. Primary data analysis should be carried out by software provided
by the manufacturer.
[0354] In the case of the genes analyzed in one embodiment of this invention, the primary data have been analyzed by
further bioinformatic tools and additional filter criteria. The bioinformatic analysis is described in detail below.

c) Data analysis

[0355] According to the Affymetrix measurement technique (Affymetrix GeneChip Expression Analysis Manual, Santa
Clara, CA) a single gene expression measurement on one chip yields the average difference value and the absolute
call. Each chip contains 16-20 oligonucleotide probe pairs per gene or cDNA clone. These probe pairs include perfectly
matched sets and mismatched sets, both of which are necessary for the calculation of the average difference, or
expression value, a measure of the intensity difference for each probe pair, calculated by subtracting the intensity of
the mismatch from the intensity of the perfect match. This takes into consideration variability in hybridization among
probe pairs and other hybridization artifacts that could affect the fluorescence intensities. The average difference is a
numeric value supposed to represent the expression value of that gene. The absolute call can take the values 'A'
(absent), 'M' (marginal), or 'P' (present) and denotes the quality of a single hybridization. We used both the quantitative
information given by the average difference and the qualitative information given by the absolute call to identify the
genes which are differentially expressed in biological samples from individuals with breast cancer versus biological
samples from the normal population. With algorithms other than the Affymetrix one we have obtained different numerical
values representing the same expression values and expression differences upon comparison.
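
The average difference described above can be sketched as follows. This is a simplified illustration that takes the plain arithmetic mean of the perfect match minus mismatch intensities over all probe pairs of one probe set; the manufacturer's software may apply further processing that is not reproduced here. The function name and the intensity values are hypothetical.

```python
from statistics import mean
from typing import Sequence


def average_difference(perfect_match: Sequence[float], mismatch: Sequence[float]) -> float:
    """Mean of (perfect match - mismatch) intensities over the probe pairs of one probe set."""
    if len(perfect_match) != len(mismatch) or len(perfect_match) == 0:
        raise ValueError("need the same, non-zero number of PM and MM intensities")
    differences = [pm - mm for pm, mm in zip(perfect_match, mismatch)]
    return mean(differences)


# Hypothetical intensities for four probe pairs (a real probe set has 16-20 pairs).
pm_values = [120.0, 200.0, 180.0, 150.0]
mm_values = [20.0, 100.0, 80.0, 50.0]
print(average_difference(pm_values, mm_values))
```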
[0356] The differential expression E in one of the breast cancer groups compared to the normal population is calculated
as follows. Given n average difference values d_1, d_2, ..., d_n in the breast cancer population and m average difference
values c_1, c_2, ..., c_m in the population of normal individuals, it is computed by the equation:



if d_j < 50 or c_i < 50 for one or more values of i and j, these particular values c_i and/or d_j are set to an "artificial" expression
value of 50. This particular computation of E allows for a correct comparison to TaqMan results.
[0357] A gene is called up-regulated in breast cancer versus normal if E>1.5 and if the number of absolute calls
equal to 'P' in the breast cancer population is greater than n/2.

[0358] A gene is called down-regulated in breast cancer versus normal if E<1.5 and if the number of absolute calls
equal to 'P' in the normal population is greater than m/2.
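
The flooring step and the calling rules of paragraphs [0356] to [0358] can be sketched as below. The sketch takes the differential expression E as an input computed beforehand and only applies the thresholds and the majority condition on 'P' calls. The function names are hypothetical, both thresholds are passed in by the caller, and the down-regulation threshold of 1/1.5 used in the example is an assumption made for illustration only.

```python
from typing import List, Sequence


def floor_values(values: Sequence[float], floor: float = 50.0) -> List[float]:
    """Set every average difference value below the floor (50 in the text) to the floor."""
    return [max(v, floor) for v in values]


def count_present(calls: Sequence[str]) -> int:
    """Number of absolute calls equal to 'P' (present)."""
    return sum(1 for call in calls if call == "P")


def call_regulation(e: float, cancer_calls: Sequence[str], normal_calls: Sequence[str],
                    up_threshold: float, down_threshold: float) -> str:
    """Return 'up', 'down' or 'unchanged' for one gene."""
    # Up-regulated: E above the threshold and more than n/2 'P' calls in the cancer group.
    if e > up_threshold and count_present(cancer_calls) > len(cancer_calls) / 2:
        return "up"
    # Down-regulated: E below the threshold and more than m/2 'P' calls in the normal group.
    if e < down_threshold and count_present(normal_calls) > len(normal_calls) / 2:
        return "down"
    return "unchanged"


# Hypothetical inputs; the down threshold 1/1.5 is an illustrative assumption.
print(floor_values([30.0, 75.0, 50.0]))
print(call_regulation(2.0, ["P", "P", "A"], ["A", "A", "P"], 1.5, 1 / 1.5))
print(call_regulation(0.4, ["A", "A", "P"], ["P", "P", "M"], 1.5, 1 / 1.5))
```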

[0359] The final list of differentially regulated genes consists of all up-regulated and all down-regulated genes in 
biological samples from individuals with breast cancer versus biological samples from the normal population. Those 
genes on this list which are interesting for a pharmaceutical application were finally validated by TaqMan. If a good 
correlation between the expression values/behavior of a transcript could be observed with both techniques, such a
gene is listed in Tables 1 to 3.

[0360] Since not only the information on differential expression of a single gene within an identified ARCHEON, but
also the information on the co-regulation of several members is important for predictive, diagnostic, preventive and
therapeutic purposes, we have combined expression data with information on the chromosomal position (e.g. golden
path) taken from publicly available databases to develop a picture of the overall transcriptome of a given tumor sample.
By this technique not only can known or suspected regions of genomes be inspected but, even more valuable, new
regions of dysregulation with chromosomal linkage can be identified. This is of value in other types of neoplasia or viral
integration and chromosomal rearrangements. By SQL based database searches one can retrieve information on 
expression, qualitative value of a measurement (denoted by Affymetrix MAS 5.0 Software), expression values derived 
from other techniques than DNA-chip hybridization and chromosomal linkage. 
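
An SQL based search of the kind mentioned in paragraph [0360] might look like the following sketch, which uses an in-memory SQLite database. The table name, column names, probe set identifiers and values are all hypothetical and serve only to show how expression values, absolute calls and chromosomal positions can be queried together.

```python
import sqlite3

# Hypothetical schema: one row per probe set measurement.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE measurement ("
    "probe_set TEXT, chromosome TEXT, position INTEGER, "
    "expression REAL, abs_call TEXT)"
)
conn.executemany(
    "INSERT INTO measurement VALUES (?, ?, ?, ?, ?)",
    [
        ("ps_a", "17", 300, 820.0, "P"),
        ("ps_b", "17", 100, 45.0, "A"),
        ("ps_c", "17", 200, 1500.0, "P"),
        ("ps_d", "8", 50, 600.0, "P"),
    ],
)


def present_on_chromosome(connection: sqlite3.Connection, chromosome: str):
    """Return (probe_set, position, expression) for 'P' calls, ordered by position."""
    cursor = connection.execute(
        "SELECT probe_set, position, expression FROM measurement "
        "WHERE chromosome = ? AND abs_call = 'P' ORDER BY position",
        (chromosome,),
    )
    return cursor.fetchall()


result = present_on_chromosome(conn, "17")
print(result)
```

Ordering by position keeps neighboring genes adjacent in the result, which is what allows co-regulation of linked genes to be inspected.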



EXAMPLE 2 

Identification of the ARCHEON 

a) Identification and localization of genes or gene probes (represented by the so called probe sets on Affymetrix arrays
HG-U95A-E or HG-U133A-B) in their chromosomal context and order on the human genome. 

[0361] For identification of larger chromosomal changes or aberrations, as they have been described in detail above,
a sufficient number of genes, transcripts or DNA-fragments is needed. The density of probes covering a chromosomal
region is not necessarily limited to the transcribed genes in the case of array-based CGH, but when RNA is used
as probe material the density is given by the distance of genes on a chromosome. The DNA-microarrays provided by
Affymetrix Inc. contain hitherto all transcripts from the known human genome, which are represented by 40,000
- 60,000 probe sets. By BLAST mapping and sorting the sequences of these short DNA-oligomers to the publicly available
sequence of the human genome represented by the so-called "golden path", available at the University of California in
Santa Cruz or from the NCBI, a chromosomal display of the whole transcriptome of a tissue specimen evolves. By
graphical display of the individual chromosomal regions and color coding of over- or under-represented transcripts
compared to a reference transcriptome, regions with DNA gains and losses can be identified.
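
The chromosomal display described in paragraph [0361] can be sketched in a simplified form: probe sets already mapped to a chromosomal position are sorted along one chromosome and labeled as over- or under-represented relative to a reference transcriptome. The tuple layout, the ratio cutoff of 1.5 and all values are hypothetical assumptions for illustration; the BLAST mapping itself is not shown.

```python
from typing import List, Sequence, Tuple

# (chromosome, position, sample expression, reference expression); hypothetical layout.
Probe = Tuple[str, int, float, float]


def classify_region(probes: Sequence[Probe], chromosome: str,
                    ratio_cutoff: float = 1.5) -> List[Tuple[int, str]]:
    """Sort the probes of one chromosome by position and label each one."""
    selected = sorted((p for p in probes if p[0] == chromosome), key=lambda p: p[1])
    labeled = []
    for _, position, sample, reference in selected:
        ratio = sample / reference
        if ratio >= ratio_cutoff:
            label = "over"
        elif ratio <= 1.0 / ratio_cutoff:
            label = "under"
        else:
            label = "neutral"
        labeled.append((position, label))
    return labeled


# Hypothetical probe sets, for illustration only.
example_probes = [
    ("17", 300, 900.0, 300.0),
    ("17", 100, 100.0, 100.0),
    ("17", 200, 50.0, 200.0),
    ("8", 10, 1.0, 1.0),
]
print(classify_region(example_probes, "17"))
```

Runs of adjacent positions carrying the same label would then indicate candidate regions of gain or loss.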


b) Quantification of gene copy numbers by combined IHC and quantitative PCR (PCR karyotyping) or directly by 
quantitative PCR 

[0362] Usually one to three paraffin-embedded tissue sections that are 5 µm thick are used to obtain genomic DNA
from the samples. Tissue sections are stained by colorimetric IHC after deparaffinization to identify regions containing
disease associated cells. Stained regions are macrodissected with a scalpel and transferred into a microcentrifuge
tube. The genomic DNA of these isolated tissue sections is extracted using appropriate buffers. The isolated DNA is
then used for quantitative PCR with appropriate primers and probes. Optionally the IHC staining can be omitted and
the genomic DNA can be directly isolated with or without prior deparaffinization with appropriate buffers. Those who
are skilled in the art may vary the conditions and buffers described below to obtain equivalent results.

[0363] Reagents from DAKO (HercepTest Code No. K 5204) and TaKaRa (Biomedicals Cat.: 9091) were used according
to the manufacturer's protocol.

[0364] It is convenient to prepare the following reagents prior to staining: 
Solution No. 7

[0365] Epitope Retrieval Solution (Citrate buffer + antimicrobial agent) (10x conc.) 20 ml ad 200 ml aqua dest. (stable
for 1 month at 2-8°C)

Solution No. 8

[0366] Washing-buffer (Tris-HCl + antimicrobial agent) (10x conc.) 30 ml ad 300 ml distilled water (stable for 1 month
at 2-8°C)

Staining solution: DAB

[0367] 1 ml solution is sufficient for 10 slides. The solution was prepared immediately before usage:
[0368] 1 ml DAB buffer (Substrate Buffer solution, pH 7.5, containing H2O2, stabilizer, enhancers and an antimicrobial
agent) + 1 drop (25-30 µl) DAB-Chromogen (3,3'-diaminobenzidine chromogen solution). This solution is stable for up
to 5 days at 2-8°C. Precipitated substances do not influence the staining result. Additionally required are: 2x approx.
100 ml Xylol, 2x approx. 100 ml Ethanol 100%, 2x Ethanol 95%, aqua dest. These solutions can be used for up to 40
stainings. A water bath is required for the epitope retrieval step.

Staining procedure: 

[0369] All reagents are pre-warmed to room temperature (20-25°C) prior to immuno-staining. Likewise all incubations 
were performed at room temperature, except the epitope retrieval, which is performed in a 95°C water bath. Between
the steps excess of liquid is tapped off from the slides with lint-free tissue (Kim Wipe).

Deparaffinization

[0370] Slides are placed in a xylene bath and incubated for 5 minutes. The bath is changed and the step repeated
once. Excess liquid is tapped off and the slides are placed in absolute ethanol for 3 minutes. The bath is changed
and the step repeated once. Excess liquid is tapped off and the slides are placed in 95% ethanol for 3 minutes. The
bath is changed and the step repeated once. Excess liquid is tapped off and the slides are placed in distilled water
for a minimum of 30 seconds.

Epitope Retrieval

[0371] Staining jars are filled with diluted epitope retrieval solution and preheated in a water bath at 95°C. The
deparaffinized sections are immersed into the preheated solution in the staining jars and incubated for 40 minutes at
95°C. The entire jar is removed from the water bath and allowed to cool down at room temperature for 20 minutes.
The epitope retrieval solution is decanted, the sections are rinsed in distilled water and finally soaked in wash buffer
for 5 minutes.

Peroxidase Blocking: 

[0372] Excess buffer is tapped off and the tissue section is encircled with a DAKO pen. The specimen is covered
with 3 drops (100 µl) of Peroxidase-Blocking solution and incubated for 5 minutes. The slides are rinsed in distilled water
and placed into a fresh washing buffer bath.

Antibody Incubation

[0373] Excess liquid is tapped off and the specimens are covered with 3 drops (100 µl) of Anti-Her-2/neu reagent
(rabbit anti-human Her2 protein in 0.05 mol/L Tris/HCl, 0.1 mol/L NaCl, 15 mmol/L NaN3, pH 7.2, containing stabilizing
protein) or negative control reagent (IgG fraction of normal rabbit serum at a protein concentration equivalent to that
of the Her2 antibody). After 30 minutes of incubation the slide is rinsed in water and placed into a fresh water bath.

Visualization 

[0374] Excess liquid is tapped off and the specimens are covered with 3 drops (100 µl) of visualization reagent.
After 30 minutes of incubation the slide is rinsed in water and placed into a fresh water bath. Excess liquid is tapped
off and the specimens are covered with 3 drops (100 µl) of Substrate-Chromogen solution (DAB) for 10 minutes. After
rinsing the specimen with distilled water, photographs are taken with a conventional Olympus microscope to document
the staining intensity and tumor regions within the specimen. Optionally, a counterstain with hematoxylin is performed.
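
As a planning aid, the incubation times given in the staining procedure can be collected into a small table and summed. The sketch below is an illustration only: the step labels and the function name are hypothetical, rinses and hands-on time are not counted, and the 30-second water step is a stated minimum, so the total is a lower bound.

```python
# (step label, minutes per incubation, repetitions), transcribed from the procedure above
STAINING_STEPS = [
    ("xylene bath", 5.0, 2),
    ("absolute ethanol", 3.0, 2),
    ("95% ethanol", 3.0, 2),
    ("distilled water (minimum)", 0.5, 1),
    ("epitope retrieval at 95 C", 40.0, 1),
    ("cooling at room temperature", 20.0, 1),
    ("soak in wash buffer", 5.0, 1),
    ("peroxidase blocking", 5.0, 1),
    ("Her-2/neu antibody or negative control", 30.0, 1),
    ("visualization reagent", 30.0, 1),
    ("DAB substrate-chromogen", 10.0, 1),
]


def total_incubation_minutes(steps=STAINING_STEPS) -> float:
    """Sum of all listed incubation times in minutes (rinses and handling excluded)."""
    return sum(minutes * repetitions for _, minutes, repetitions in steps)


if __name__ == "__main__":
    for label, minutes, repetitions in STAINING_STEPS:
        print(f"{label:40s} {repetitions} x {minutes:g} min")
    print(f"minimum total incubation time: {total_incubation_minutes():g} min")
```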

DNA extraction

[0375] The whole specimens or dissected subregions are transferred into microcentrifuge tubes. Optionally, a small
amount (10 µl) of preheated TaKaRa solution (DEXPAT™) is placed onto the specimen to facilitate
sample transfer with a scalpel. 50 to 150 µl of TaKaRa solution are added to the samples, depending on the size of
the tissue sample selected. The samples are incubated at 100°C for 10 minutes in a block heater, followed by
centrifugation at 12,000 rpm in a microcentrifuge. The supernatant is collected using a micropipette and placed in a separate
microcentrifuge tube. If no deparaffinization step has been undertaken, care must be taken not to withdraw tissue debris
and resin. Genomic DNA left in the pellet can be collected by adding resin-free TaKaRa buffer, followed by an additional
heating and centrifugation step. Samples are stored at -20°C.

[0376] Genomic DNA from different tumor cell lines (MCF-7, BT-20, BT-474, SKBR-3, AU-565, UACC-812, UACC-893,
HCC-1008, HCC-2157, HCC-1954, HCC-2218, HCC-1937, HCC-1599, SW480) or from lymphocytes is prepared
with the QIAamp® DNA Mini Kit or the QIAamp® DNA Blood Mini Kit according to the manufacturer's protocol. Usually
between 1 ng and 1 µg of DNA is used per reaction.

Quantitative PCR

[0377] To measure the copy number of the genes within the patient samples, the respective primer/probe mixes (see
table below) are prepared by mixing 25 µl of the 100 µM stock solution "Upper Primer" and 25 µl of the 100 µM stock
solution "Lower Primer" with 12.5 µl of the 100 µM stock solution TaqMan probe (quencher TAMRA) and adjusting the
volume to 500 µl with aqua dest. For each reaction, 1.25 µl DNA extract of the patient samples or 1.25 µl DNA from the
cell lines is mixed with 8.75 µl nuclease-free water and added to one well of a 96-Well Optical Reaction Plate (Applied
Biosystems Part No. 4306737). 1.5 µl primer/probe mix, 12.5 µl TaqMan Universal PCR Mix (2x) (Applied Biosystems Part
No. 4318157) and 1 µl water are then added. The 96-well plates are closed with 8 Caps/Strips (Applied Biosystems
Part No. 4323032) and centrifuged for 3 minutes. Measurements of the PCR reaction are done according to the
instructions of the manufacturer with a TaqMan 7900 HT from Applied Biosystems (No. 20114) under appropriate
conditions (2 min. 50°C, 10 min. 95°C, 0:15 min. 95°C, 1 min. 60°C; 40 cycles). The software SDS 2.0 from Applied
Biosystems is used according to the respective instructions. CT values are then further analyzed with appropriate software
(Microsoft Excel™).
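
The pipetting scheme above fixes the oligonucleotide concentrations in each well, and the CT values can be turned into a relative copy number. The following sketch works through both calculations. The volumes and stock concentrations are taken from the text. Everything else is an assumption for illustration: the function names are hypothetical, and the comparative CT formula (with a reference gene, a calibrator sample such as normal lymphocyte DNA, and a default amplification efficiency of 2) is a commonly used approach that the text does not itself specify.

```python
STOCK_UM = 100.0        # stock concentration of primers and probe (from the text)
MIX_VOLUME_UL = 500.0   # final volume of the primer/probe mix (from the text)
# Per-well volumes from the text: DNA, water, primer/probe mix, 2x PCR mix, water
REACTION_VOLUME_UL = 1.25 + 8.75 + 1.5 + 12.5 + 1.0


def conc_in_mix_um(stock_volume_ul: float) -> float:
    """Concentration (µM) of one oligo after dilution into the primer/probe mix."""
    return STOCK_UM * stock_volume_ul / MIX_VOLUME_UL


def conc_in_reaction_nm(mix_conc_um: float, mix_volume_ul: float = 1.5) -> float:
    """Final concentration (nM) of one oligo in a single reaction well."""
    return mix_conc_um * 1000.0 * mix_volume_ul / REACTION_VOLUME_UL


def relative_copy_number(ct_target_sample: float, ct_ref_sample: float,
                         ct_target_calibrator: float, ct_ref_calibrator: float,
                         efficiency: float = 2.0) -> float:
    """Comparative CT estimate (assumed method, not stated in the text).

    The target CT is normalised to a reference gene in both the sample and
    the calibrator; the fold change is efficiency ** -(ddCT).
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_calibrator = ct_target_calibrator - ct_ref_calibrator
    return efficiency ** -(delta_sample - delta_calibrator)


if __name__ == "__main__":
    primer_um = conc_in_mix_um(25.0)    # each primer in the mix
    probe_um = conc_in_mix_um(12.5)     # probe in the mix
    print(REACTION_VOLUME_UL, primer_um, probe_um)
    print(conc_in_reaction_nm(primer_um), conc_in_reaction_nm(probe_um))
    # Made-up CT values: target appears 3 cycles earlier than in the calibrator
    print(relative_copy_number(24.0, 26.0, 27.0, 26.0))
```

Under these assumptions the 2x master mix makes up exactly half of the reaction volume, which supports reading the master mix volume as 12.5 µl.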
Table 1



DNA SEQ ID NO:  Protein SEQ ID NO:  Genbank ID    Unigene_v133_ID  Locus Link ID  Gene Name

1               27                  NM_006148.1   75080            3927           LASP1
2               28                  NM_000723.1   635              782            CACNB1
3               29                  NM_000981.1   252723           6143           RPL19
4               30                  Y13467        15589            5469           PPARGBP
5               31                  NM_016507.1   123073                          CrkRS
6               32                  AB021742.1    322431           4761           NEUROD2
7               33                  NM_006804.1   77628            10948          MLN64
8               34                  NM_003673.1   111110           8557           TELETHONIN
9               35                  NM_002686.1   1892             5409           PNMT
10              36                  X03363.1      323910           2064           ERBB2
11              37                  AB008790.1    86859            2886           GRB7
12              38                  NM_002809.1   9736             5709           PSMD3
13              39                  NM_000759.1   2233             1440           GCSFG
14              40                  AI023317      23106            9862           KIAA0130
15              41                  X55005                         7067           c-erbA-1
16              42                  X72631        211606           9572           NR1D1
17              43                  NM_007359.1   83422            22794          MLN51
18              44                  U77949.1      69563            990            CDC6
19              45                  U41742.1                       5914           RARA
20              46                  NM_001067.1   156346           7153           TOP2A
21              47                  NM_001552.1   1516                            IGFBP4
22              48                  NM_001838.1   1652                            CCR7 EBI1
23              49                  NM_003079.1   332848           6605           SMARCE1 BAF57
24              50                  X14487        99936            3858           KRT10
25              51                  NM_000223.1   66739                           KRT12
26              52                  NM_002279.2   32950            3884           hHKa3-II
53              76                  NM_005937     349196           4302           MLLT6
54              77                  XM_008147     184669           7703           ZNF144
55              78                  NM_138687     432736           8396           PIP5K2B
56              79                  NM_020405     125036           57125          TEM7
57              80                  XM_012694     258579           22806          ZNFN1A3
58              81                  XM_085731     13996            147179         WIRE
59              82                  NM_002795     82793            5691           PSMB3
60              83                  NM_033419     91668            93210          MGC9753 Variant a
61              84                                                                MGC9753 Variant c
62              85                                                                MGC9753 Variant d
63              86                                                                MGC9753 Variant e
64              87                                                                MGC9753 Variant g
                                                                                  MGC9753 Variant h
66              89                                                                MGC9753 Variant i
67              90                  AF395708      374824           94103          ORMDL3
68              91                  NM_032875     194498           84961          MGC15482
69              92                  NM_032192     286192           84152          PPP1R1B
70              93                  NM_032339     333526           84299          MGC14832
71              94                  NM_057555     12101            51242          LOC51242
72              95                  NM_017748     8928             54883          FLJ20291
73              96                  NM_018530     19054            55876          Pro2521
74              97                  NM_016339     118562           51195          Link-GEFII
75              98                  NM_032865     294022           84951          CTEN

Table 2

DNA SEQ ID NO:  Gene description

1               Member of a subfamily of LIM proteins that contains a LIM domain and an SH3 (Src homology
                region 3) domain
2               Beta 1 subunit of a voltage-dependent calcium channel (dihydropyridine receptor), involved in
                coupling of excitation and contraction in muscle, also acts as a calcium channel in various
                other tissues
3               Ribosomal protein L19, component of the large 60S ribosomal subunit
4               Protein with similarity to nuclear receptor-interacting proteins; binds and co-activates the
                nuclear receptors PPARalpha (PPARA), RARalpha (RARA), RXR, TRbeta1, and VDR
5               we26e0 CDC2-related protein kinase 7
6               Neurogenic differentiation, a basic-helix-loop-helix transcription factor that mediates neuronal
                differentiation
7               Protein that is overexpressed in malignant tissues, contains a putative trans-membrane region
                and a StAR Homology Domain (SHD), may function in steroidogenesis and contribute to tumor
                progression
8               Telethonin, a sarcomeric protein specifically expressed in skeletal and heart muscle, caps titin
                (TTN) and is important for structural integrity of the sarcomere
9               Phenylethanolamine N-methyltransferase, acts in catecholamine biosynthesis to convert
                norepinephrine to epinephrine
10              Tyrosine kinase receptor that has similarity to the EGF receptor, a critical component of IL-6
                signaling through the MAP kinase pathway, overexpression associated with prostate, ovary
11              Growth factor receptor-bound protein, an SH2 domain-containing protein that has isoforms
                which may have a role in cell invasion and metastatic progression of esophageal carcinomas
12              Non-ATPase subunit of the 26S proteasome (prosome, macropain)
13              Granulocyte colony stimulating factor, a glycoprotein that regulates growth, differentiation, and
                survival of neutrophilic granulocytes
14              Member of the Vitamin D Receptor Interacting Protein co-activator complex, has strong
                similarity to thyroid hormone receptor-associated protein (murine Trap 100) which functions as
                a transcriptional coregulator
15              Thyroid hormone receptor alpha, a high affinity receptor for thyroid hormone that activates
                transcription; homologous to avian erythroblastic leukemia virus oncogene
16              encoding Rev-ErbAalp nuclear receptor subfamily 1, group D, member 1
17              Protein that is overexpressed in breast carcinomas
18              Protein which interacts with the DNA replication proteins PCNA and Orc1, translocates from
                the nucleus following onset of S phase; S. cerevisiae homolog Cdc6p is required for initiation
                of S phase
19              Retinoic acid receptor alpha, binds retinoic acid and stimulates transcription in a
                ligand-dependent manner
20              DNA topoisomerase II alpha, member of a family of proteins that relieves torsional stress
                created by DNA replication, transcription, and cell division
21              Insulin-like growth factor binding protein, the major IGFBP of osteoblast-like cells, binds IGF1
                and IGF2 and inhibits their effects on promoting DNA and glycogen synthesis in osteoblastic
                cells
22              HUMEBI103 G protein-coupled receptor (EBI 1) gene exon 3 chemokine (C-C motif) receptor
                7 G protein-coupled receptor
23              Protein with an HMG 1/2 DNA-binding domain that is subunit of the SNF/SWI complex
                associated with the nuclear matrix and implicated in regulation of transcription by affecting
                chromatin structure
24              Keratin 10, a type I keratin that is a component of intermediate filaments and is expressed in
                terminally differentiated epidermal cells; mutation of the corresponding gene causes
                epidermolytic hyperkeratosis
25              Keratin 12, a component of intermediate filaments in corneal epithelial cells; mutation of the
                corresponding gene causes Meesmann corneal dystrophy
                Hair keratin 3B, a type I keratin that is a member of a family of structural proteins that form
                intermediate filaments
53              MLLT6 Myeloid/lymphoid or mixed-lineage leukemia (trithorax homolog, Drosophila);
                translocated to, 6
54              zinc finger protein 144 (Mel-18)
55              phosphatidylinositol-4-phosphate 5-kinase type II beta isoform a
56              tumor endothelial marker 7 precursor
57              zinc finger protein, subfamily 1A, 3
58              WASP-binding protein putative cr16 and wip like protein similar to Wiskott-Aldrich syndrome
                protein
59              proteasome (prosome, macropain) subunit, beta type, 3
60              Predicted
67              ORM1-like 3 (S. cerevisiae)
68              F-box domain A Receptor for Ubiquitination Targets
69              protein phosphatase 1, regulatory (inhibitor) subunit 1B (dopamine and cAMP regulated
                phosphoprotein, DARPP-32)
70              Predicted Protein
71              Predicted Protein
72              Predicted Protein
73              Predicted Protein
74              Link-GEFII: Link guanine nucleotide exchange factor II
75              C-terminal tensin-like
Subcellular localization




Plasma membrane
Cytoplasm
Nucleus

Cytoplasm
Cytoplasm

Plasma membrane

Cytoplasm
Extracellular space

Nucleus
Nucleus

nucleus
nucleus
nucleus

plasma membrane
nucleus nuclear chromosome
cytoplasm
cytoplasm

Gene function

SH3/SH2 adapter protein
voltage-gated calcium channel membrane fraction Channel [passive transporter]
RNA binding structural protein of ribosome protein biosynthesis
transcription co-activator nucleus Pol II transcription

transcription factor transcription regulation from Pol II promoter neurogenesis
mitochondrial transport steroid and lipid metabolism
structural protein of muscle sarcomere alignment
phenylethanolamine N-methyltransferase Transferase
Neu/ErbB-2 receptor receptor signaling protein tyrosine kinase
SH3/SH2 adapter protein EGF receptor signaling pathway
26S proteasome Protein degradation Proteasome subunit
developmental processes positive control of cell proliferation
fatty acid omega-hydroxylase fatty acid omega-hydroxylase
DNA-binding protein Transcription factor
steroid hormone receptor transcription co-repressor

nucleotide binding cell cycle regulator DNA replication checkpoint regulation of CDK activity
retinoic acid receptor transcription co-activator transcription factor
DNA binding DNA topoisomerase (ATP-hydrolyzing)
skeletal development DNA metabolism signal transduction cell proliferation

chromatin binding transcription co-activator nucleosome disassembly transcription
Cell structure Cytoskeletal Epidermal Development and Maintenance
structural protein vision cell shape and cell size control intermediate filament
cell shape and cell size control Cell structure
leucine-zipper containing fusion

Table 4



DNA SEQ ID NO:  Protein SEQ ID NO:  Gene Name  dbSNP ID    Type             Codon            AA-Seq

9               34                  ERBB2      rs2230698   coding-synon     TCA|TCG          S|S
9               34                  ERBB2      rs2230700   noncoding
                34                  ERBB2      rs1058808   coding-nonsynon  CCC|GCC          P|A
                34                  ERBB2      rs1801200   noncoding
9               34                  ERBB2      rs903506    noncoding
                34                  ERBB2      rs2313170   noncoding
                34                  ERBB2      rs1136201   coding-nonsynon  ATC|GTC          I|V
9               34                  ERBB2      rs2934968   noncoding
                34                  ERBB2      rs2172826   noncoding
9               34                  ERBB2      rs1810132   coding-nonsynon  ATC|GTC          I|V
9               34                  ERBB2      rs1801201   noncoding
14              39                  c-erbA-1   rs2230702   coding-synon     TCC|TCT          S|S
14              39                             rs2230701   coding-synon     GCC|GCT          A|A
14              39                  c-erbA-1   rs1126503   coding-nonsynon  ACC|AGC          T|S
14              39                  c-erbA-1   rs3471      noncoding
19              44                  TOP2A      rs13695     noncoding
19              44                  TOP2A      rs471692    noncoding
19              44                  TOP2A      rs558068    noncoding
19              44                  TOP2A      rs1064288   noncoding
19              44                  TOP2A      rs1061692   coding-synon     GGA|GGG          G|G
                44                  TOP2A      rs520630    noncoding
19              44                  TOP2A      rs782774    coding-nonsynon  AAT|ATT|ATT|TTT  N|I|I|F
19              44                  TOP2A      rs565121    noncoding
                44                  TOP2A      rs2586112   noncoding
19              44                  TOP2A      rs532299    coding-nonsynon  TTT|GTT          F|V
19              44                  TOP2A      rs2732786   noncoding
                44                  TOP2A      rs1804539   noncoding
19              44                  TOP2A      rs1804538   noncoding
19              44                  TOP2A      rs1804537   noncoding
                44                  TOP2A      rs1141364   coding-synon     AAA|AAG          K|K
23              48                  KRT10      rs12231     noncoding
23              48                  KRT10      rs1132259   coding-nonsynon  CAT|CGT          H|R
23              48                  KRT10      rs1132257   coding-synon     CTG|TTG          L|L
23              48                  KRT10      rs1132256   coding-synon     GCC|GCT          A|A
23              48                  KRT10      rs1132255   coding-synon     CTG|TTG          L|L
23              48                  KRT10      rs1132254   coding-synon     GGC|GGT          G|G
23              48                  KRT10      rs1132252   coding-synon     TTC|TTT          F|F
23              48                  KRT10      rs1132268   coding-nonsynon  CAG|GAG          Q|E
23              48                  KRT10      rs1132258   coding-nonsynon  CGG|TGG          R|W
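
The relation between the "Codon", "AA-Seq" and "Type" columns of Table 4 can be reproduced with the standard genetic code. The sketch below is an illustration under stated assumptions: the function name is hypothetical, the codon field is assumed to list alternative codons separated by "|", and only coding entries are handled (noncoding polymorphisms have no codon field).

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    first + second + third: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, first in enumerate(BASES)
    for j, second in enumerate(BASES)
    for k, third in enumerate(BASES)
}


def classify_codon_variants(codon_field: str) -> tuple[str, str]:
    """Translate a 'Codon' entry such as 'TCA|TCG' and classify the polymorphism.

    Returns the 'AA-Seq' string and either 'coding-synon' (all alleles encode
    the same residue) or 'coding-nonsynon' (at least two residues differ).
    """
    codons = codon_field.upper().split("|")
    residues = [CODON_TABLE[codon] for codon in codons]
    kind = "coding-synon" if len(set(residues)) == 1 else "coding-nonsynon"
    return "|".join(residues), kind


if __name__ == "__main__":
    for field in ("TCA|TCG", "CCC|GCC", "AAT|ATT|ATT|TTT", "CGG|TGG"):
        print(field, *classify_codon_variants(field))
```

A malformed codon raises a KeyError, which is convenient for spotting transcription errors in the codon column.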
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3'TAMRA | 


CO 




3'TAMRA | 






3'TAMRA 1 






GC 3TAMRA | 






3'TAMRA 1 




in 


3'TAMRA | 


in 




3'TAMRA 1 






3'TAMRA 1 


in 


in 


3'TAMRA 1 






3'TAMRA | 






CT 












GT 


TC 




TATTCA 


















CT 










CJ 














CAGACAG' 






CTCCCGA 






TATTCAT 


TCAAAGA 


GCAGAGA 


TCCTATG 






AG 








Eh 


CJ 


AGGCAGT 


cd 


CCA 




U 


U 


CJ 
Eh 
Eh 
CD 

EH 


AGTCT 


CT 


jAACCCA 


EQUENi 


O 

Eh 
CD 
O 
-=C 


u 
o 
< 
cd 
< 
cd 


TTTCA 


Eh 

rtj 
O 
O 
<C 


AAACCT' 


GAAAC 


ATACC 


CD 

EH 

u 


3 TAGAAT 


iATCCAT 


,GATTGC 


'GAAGTG 


'TTTTAG 


'GGTGCC 


EH 

CJ 
CJ 


^TATGGT 


XGAACA 


Eh 
CJ 
< 
CD 
Eh 


CJ 
Eh 
Eh 


VTCCCAA 


\TAGAAT 


CJ 
Eh 
Eh 
CJ 


< 

CD 
CD 


CGCTTAT 


rGATTCC 


\CCCCAG 


CD 

EH 

CD 
CJ 

y 


^CCCTTG 




O 
CD 

cd 

Eh 


cd 
sC 

Eh 
< 
CD 
r< 
CD 
<H 


CTGAGCCI 


CTGGTCCC 


TCAATTCC 


3GTGTTCC 


CJ 

Eh 

cj 
o 
cd 
fiC 
cd 


o 

Eh 

O 
Eh 


CTTAATGG 


cactgag; 


CD 

< 
o 

Eh 
CJ 
Eh 
CD 


GCCCTGCI 


TCATTGG 1 ] 


CD 
U 
U 


CCACATCC 


GCATTCGC 


ACCTGCGC 


GGTGACTC 


GGACTCAC 


ACTTTCCi 


CACAGGA2 


CCAGTCTC 


CCCAGCTC 


ATGAGCG' 


TACTTAT' 


GGAAGCAi 


TAGAGCTi 


CTTTGTAi 


CD 
CD 




5 ' AGCGAT< 


FAM 5' CCTGCC. 


5'ATCCCO 


5'CAGCCT' 


< 
cj 

cd 

Eh 

PL, 


5' CCCCAT 


cd 

s 

o 
cd 

EH 


FAM 5'CAGCCT 


5' CCTGAA 


5' TATTAA 


FAM 5' TATCTG 


5' CACAGA 


5' GCGAGG 


FAM 5' CTGTGA 


5' CCGTCT 


5' ACCCAT 


FAM 5' AGTGGC 


5' CCCCAT 


5' CCAGAG 


FAM 5' CCAGAA 


5' CTGCCC 


H 


CJ 
CD 
Eh 

u 

CD 
< 


5' TCCCTG 


5' TCTCAG 


FAM 5' TCCAGT 


5' CACTTC 


s 

EH 

CJ 
CJ 
CJ 


FAM 5' CAGCGT 


PRIMER | 


MLLT6 REV | 


ZNF144 | 


ZNF144 FOR j 


ZNF144 REV | 


PIP5K2B f 


PIP5K2B FOR | 


PIP5K2B REV i 


TEM7 i 


TEM7 FOR i 


|TEM7 REV | 


ZNFN1A3 | 


ZNFN1A3 FOR I 


[ZNFN1A3 REV ' 


|WIRE 


IWIRE FOR 


|WIRE REV 


|PSMB3 


|PSMB3 FOR 


|PSMB3 REV 


IMGC9753 


IMGC9753 FOR 


|MGC9753 REV 


IORMDL3 


|ORMDL3 FOR 


[ORMDL3 REV 


[MGC15482 


|MGC 15482 FOR 


|MGC 15482 REV 


1PPP1R1B 



50 



55 
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GB ID I 


Z24029I 




G05498 X53777 


Z51080 


Z52895 


L29873 


G07286Z39013 


L29870 




G15195 




Z52854 


G07073 X03438 




i Gil 172X55068 


! G14779 T50487 


G11580T50487 


! Z51301 


1 Z52160 


Z53182 


Z52130 


G15440 


G11900 


G13994 


X60690 


Z53675 




PCR size (bp) | 


128-142I 


CS 
CS 


171-318I 


71-103I 


119-151 


NO 


151-1521 


NO 


O 

cs 






150-1661 


102-103 








NO 




156-168 


I 167-173 


! 239-251 


r> cn 


© 




o 
o 


237 - 253 




reverse | 


TGCCGTGCCAGAGAGA j 


GCCCAGCCTGTCACTTATTC i 


CAGCATTGGATGCAATCC | 


AGGACAGTGTGTAGCCCTTC i 


TGCCTACTGGAAACCAGA | 


NGGAGGTTGCAGTGAGCCAAGAT 


TTGTTTCCCTTTGACTTTCTGA | 


GTCTGGGTCTTTATGGNGCTTGTG 


GCAT AC AGC AC CCTCTACCT i 


CAGGAGTGAGACACTCTCCATG | 


GCGTGTCTGTCTCCATGTGTGC 


CTGGAGGTTGGCTTGTGGAT 1 


< 

CJ 
CJ 

< 

o 

CJ 

< 
u 

CJ 

s 

cj 


TTGTCACCCCATTGCCTTTC \ 


CTTGTCCCTTCTCAATCCTCC | 


GATTACAGTGCTCCCTCTCCC 


GATTACAGTGCTCCCTCTCCC 


TTTAGACTTGGTAACTGCCG 


|GCCATGATCTCCCAAAGCC 


H 
U 
< 
Eh 
< 

CD 
CD 

O 
Eh 
CD 
H 
O 
Eh 
CD 
H 
CD 
H 


ITACATGAAGGCATGGTCTG 


CCCACGGCTTTCTTGATCTA 


CD 
Eh 
CD 
CD 

Eh 
CJ 
Eh 

1 

H 

CJ 

% 

CJ 
Eh 
O 
< 
CJ 


[ATTCAGCCTCAGTTCACTGCTTC 


TAGTCTCTGGGACACCCAGA 


|ACTGACTGCGCCACTGC 


|GAACAGGGTTAGTCCATTCG 


rward 


GCAGAAAAATCCT [ 


o 
o 
o 

U 
< 

cd 

t=£ 


cd 
u 


GACCATGA [ 


CAGAAATGTGA | 


CTTTCAAAGCT | 


ATGCTCAAACC [ 


TCATTTTCTTCAG j 


TTCCATGG | 


CJ 
CJ 

CJ 
H 

u 

Eh 


CTCAGCCTGC i 


GTGATG | 


TTGCTG j 


CCTTGTGA ! 


TCAGCTCCC | 


TTAATTAAGCTGC | 


GCTGCATGGC ! 


AGGCTTTG 


AGAACTTTTCAG 


TGAGCC 


CCAATGA 


CACTGGCTCC 


CCCATGAGG 


ATAGTAATTATCC 


ACTCAGAG 


CTACCCTGTTGAG 


AAATCACC 


<2 


ACAGTCTATCAA 


GACAACAGAGCG. 


TGGTCATTCGAC. 


fiC « 
O « 
O 

H C 
O ri 

a « 

H C 

o c 


CATAGGTATGTT 


AAGGGGAAGGGG 


CAAAAGCTTATG 


TAGGTTCACCTC 


CGGACCAGAGTG 


AGGGGAGAATAA 


TGGATTCACTGA 


TCCCCAATGACG 


GGTCCCACGAAT 


TCGATCTCCTGA 


.CCTTGGATAGAT 


H 

Eh 

CD 

1 

U 
CD 

% 
Eh 
Eh 


GGTTTTAATTAA 


AGTTTGACACTG 


TGCAGATGCCTA 


TCGAGGTTATGG 


GCTGATCTGAAG 


GATAAAAACAAG 


TGTAATGTAAGC 


GTGAGTTCAAGC 


(GATCCAGTGGAG 


|ATTCCTGAGTGT 


| T C GAGAAGGAC A 


Q 


D17S946I 


D17S1181 


D17S2026 


D17S838 


D17S1818| 


D17S614| 


D17S2019| 


D17S608 


D17S1655 


D17S2147 


D17S754 


D17S1814 


D17S2007 


D17S1246 


D17S1979 


D17S1984 


D17S1984 


D17S1867 


D17S1788 


D17S1836I 


D17S1787 


D17S2154 


D17S1955 


| D17S2098 


D17S518 


|D17S1851 


| D11S4358 






cs 






•) NO 




oo 


On 


o 












NO 




00 


ON 




CS 




q cs 


cs 


NO 

CS 


cs 


00 

cs 


ON 

cs 
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OS 
0© 




s 


Os 


5 


Os 


Os 




L36685 


o 

OS 


© 


GB ID 


1 


G14f 




G06f 


OS 

<n 


G071 




SO 

O 


G13< 


1 L181 


00 

E 


'S 

& 

a 




S 




Os 


© c 


o 












2 - 120 


PCR sL 


























reverse i 


AGTCAGCTGAGATTGTGCC 


CCCCACACACAGCTCATATG 


GCAACAGAGGGAGACTCCAA 


TATGGAGTACCTACTCTATGCCAGG 


AGCTGTGACAAATGCCTGTA ; 


AAACACCACACTCTCCCCTG 


GGGGCAGACGACTTCTCCTT 


EH 

Eh 
Eh 

u 

Eh 
Eh 
U 
Eh 

cd 
cd 
< 

CD 
H 

% 
H 

u 
o 
-=c 

CJ 
Eh 
Eh 


TCCGCAT CCT T T T T AAGAGGCAC 


TGACGTGCTATTTCCTGTTTTGTCT 


< 

CD 
CD 
< 
CJ 
CJ 
CJ 
CJ 

u 
o 


GAGTCCGCTACCTGAGTGCT 


forward i 


GTTCTTTCCTCTTGTGGGG | 


CAAGCCAAGACATCCCAGTT ; 


TTTTCTCTCTCATTCCATTGGG 


!TCCCATCCCGTAAGACCTC | 


H C 
H C 
Eh rt 
U C 
O rt 
O C 

ScT 

CD < 
^ <- 

< C 

o c. 

Eh rf 
Eh C, 

< E- 


TCACTGTCCTCCAAGCCAG j 


TTCTTGGGCTTCCCGTAGCC [ 


| GGGGATACAACCTTTAAAGTTCC j 


GCTGAAATAGCCATCTTGAGCTAC j 


CTTTCACTCTTTCAGCTGAAGAGG j 


GTTTGTTGCTATGCCTGC i 


;ACTCCTCATCTGTAGGGTCT 


a 


)17S964 


19S1091 


17S1179 


10S2160 


17S1230I 


I7S2011 


L7S1237 


I7S2038 


17S2091 


M7S649 


L7S1190 


M87506I 




t— i 


Q 


Q 


Q 


Q C 


5 Q 


Q 


Q 


Q 




Q 




l 


© 










i SO 




oo 


Os 






9 
<110> Bayer AG



<120> METHODS AND COMPOSITIONS FOR THE PREDICTION, DIAGNOSIS, PROGNOSIS, 
PREVENTION AND TREATMENT OF MALIGNANT NEOPLASIA 



<130> LeA 36108.1 EP 

<150> EP02010291.9 

<151> 2002-05-21 

<160> 314 

<170> PatentIn version 3.1

<210> 1 

<211> 3846 

<212> DNA 

<213> Homo 



120 
180 
240 



420 
480 



<400> 1 

gcctoccgco agctcgcctc ggggaacagg acgcgcgtga gotcaggcgt ccccgccoca 
gcttttctog gaacoatgaa coocaactgc gcccggtgog goaagatcgt gtatcooacg 
gagaaggtga actgtctgga taagttctgg cataaagcat gcttccattg cgagaootge 
aagatgacac tgaacatgaa gaactacaag ggctacgaga agaagcccta otgcaacgca 
oactacccca agcagtcctt oaccatggtg goggaoaccc cggaaaaoct tcgcctoaag 
caacagagtg agctccagag toaggtgcgc tacaaggagg agtttgagaa gaacaagggc 
aaaggtttea gogtagtggc agacacgccc gagctccaga gaatcaagaa gacccaggac 
cagatcagta atataaaata ocatgaggag tttgagaaga gccgcatggg ccctagoggg 
ggcgagggca tggagccaga gcgtcgggat toacaggacg goagcagcta ccggcggccc 
otggagcagc agcagcctca ccacatcccg aocagtgccc cggtttaoca gcagccccag 
cagcagccgg tggcccagtc ctatggtggc tacaaggagc ctgcagcccc agtotccata 
cagcgcagcg ooocaggtgg tggcgggaag cggtaccgcg cggtgtatga ctacagcgco 
gccgacgagg acgaggtctc cttccaggac ggggacacca tcgtcaacgt gcagcagatc 
gacgacggct ggatgtacgg gacggtggag cgcaccggcg acacggggat gctgccggoc 
aaotacgtgg aggccatctg aacccggagc goccccatct gtcttcagca cattccacgg 
catcgcatcc gtootgggcg tgagccgtcc attcttcagt gtctctgttt tttaaaacot 
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1020 



gcgacagctt gtgattccta cccotcttcc agcttctttt gccaactgaa gccttcttct 
gccacttctg ogggctccct cctotggcag gcttcccccg tgatogactt cttggttttc 1080 
tctotggatg gaaogggtat gggeotctet gggggaggca gggctggaat gggagaoctg 1140 
ttggcctgtg ggcotcacct gcccctctgt tctctcccct oacatcctcc tgcccagotc 1200 
otcacatacc cacacattcc agggctgggg tgagcctgac tgcoaggacc ocaggtoagg 
ggotccotac attocccaga gtgggatoca cttottggtt cctgggatgg cgatggggac 
tctgccgotg tgtagggacc agtgggatgg gctetaectc tctttctoaa agagggggct 
ctgcccaoot ggggtctctc tccctacctc cctcctcagg ggcaacaaoa ggagaatggg 1440 
gttoctgotg tggggcgaat tcatcccetc cccgcgcgtt ccttcgcaca ctgtgatttt 1500 
gccetcctgc ocacgcagac ctgcagcggg caaagagctc ocgaggaagc acagcttggg 1560 
tcaggttott gcctttctta attttaggga aagotaccgg aaggagggga acaaggagtt 
ctcttccgoa gcccotttcc ccacgcccac occeagtctc cagggaccct tgcctgcotc 
ctaggctgga agceatggtc cogaagtgta gggcaagggt gcctcaggac cttttggtot 
teagoctccc tcagccccca ggatctgggt taggtggccg ctcotccotg otcotcatgg 
gaagatgtct cagagccttc catgaectcc cctccecagc ccaatgecaa gtggacttgg 
agctgcacaa agtcageagg gaeoaetaaa tetccaagac ctggtgtgcg gaggcaggag 
catgtatgtc tgcaggtgtc tgacaegcaa gtgtgtgagt gtgagtgtga gagatggggc 
gggggtgtgt ctgtaggtgt ctctgggoct gtgtgtgggt ggggttatgt gagggtatga 
agagctgtot tcccctgaga gtttcctoag aacccacagt gagaggggag ggctcctggg 
gcagagaagt tccttaggtt ttctttggaa tgaaattoct cettcccccc atctctgagt 
ggaggaagcc oaccaatctg occtttgcag tgtgtcaggg tggaaggtaa gaggttggtg 
tggagttggg gctgecatag ggtetgcage ctgotggggc taagcggtgg aggaaggcto 
tgtcactoca ggcatatgtt tccccatctc tgtctggggc tacagaatag ggtggoagaa 
gtgtoaccct gtgggtgtct ccctcggggg otcttccoct agacctccce ctcaottaca 
taaagctccc ttgaagcaag aaagagggtc ocagggetgc aaaactggaa gcacagcctc 
ggggatgggg agggaaagac ggtgctatat ocagttcctg ctctctgctc atgggtggct 
gtgacaacec tggootcaot tgattcatct ctggttttct tgccaccotc tgggagtccc 
catcocattt tcatectgag cccaaccagg ocotgccatt ggcotcttgt ccettggcac 
acttgtaccc acaggtgagg ggcaggacct gaaggtattg gcotgttcaa eaatcagtoa 
tcatgggtgt ttttgtcaac tgcttgttaa ttgatttggg gatgtttgco oogaatgaga 
ggttgaggaa aagactgtgg gtggggaggc octgcctgac ocatcoottt tcctttctgg 
ccccagccta ggtggaggca agtggaatat cttatattgg gcgatttggg ggctcgggga 
ggcagagaat ctcttgggag tcttgggtgg cgctggtgca ttctgtttcc tcttgatetc 
aaagcacaat gtggatttgg ggaccaaagg tcagggacac atccccttag aggacctgag 
tttgggagag tggtgagtgg aagggaggag oagoaagaag cagcotgttt tcactcagct 
taattotcct tcccagataa ggcaagceag tcatggaatc ttgctgoagg ccctocctot 3120 
actcttcctg tcctaaaaat aggggccgtt ttcttacaca ccoccagaga gaggagggac 3180 
tgtcacactg gtgctgagtg accgggggct gctgggcgtc tgttctttac oaaaaooatc 3240 
catccotaga agagcacaga gocctgaggg gctgggotgg gotgggctga gcccctggtc 
ttctotacag ttcaoagagg tctttcagct catttaatcc oaggaaagag gcatoaaagc 
tagaatgtga atataaottt tgtgggcoaa tactaagaat aacaagaago ccagtggtga 
ggaaagtgcg ttctcccagc actgcctcct gttttctccc tctoatgtcc ctccagggaa 
aatgacttta ttgcttaatt tctgcctttc ccccctcaca oatgcaottt tgggcctttt 
tttatagctg gaaaaaaoaa aataccaccc tacaaacotg tatttaaaaa gaaaoagaaa 
tgaccacgtg aaatttgcct otgtccaaac atttoatocg tgtgtatgtg tatgtgtgtg 
agtgtgtgaa gccgocagtt oatottttta tatggggttg ttgtctcatt ttggtctgtt 
ttggtcccct ccctogtggg cttgtgctcg ggatcaaacc tttctggcct gttatgattc 
tgaaoatttg acttgaacca caagtgaatc tttctcctgg tgactcaaat aaaagtataa 
ttttta 



<210> 2 

<211> 1711 

<212> DNA 

<213> Homo sapiens 



1260 
1320 
1380 



1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 



3300 
3360 
3420 
3480 
3540 
3600 
3660 
3720 
3780 
3840 
3846 
<400> 2

gagggaaggc aggaaggagg cagccgaagg 
tgcgcgogct gctttoggot cocaoggcot 
gocggoggcg ggaggggagg ctcctctcca 



ccgagctggg tggctggaoc gggtgotggc 
ctcccatgog otgagggagc ccggctgcgg 
tggtccagaa gaccagcatg tcccggggcc 



cttaeocacc ctcccaggag atccccatgg aggtcttcga ccccagcccg oagggcaaat 
aeagcaagag gaaagggcga ttcaaacggt cagatgggag oacgtcctcg gataccacat 
ccaacagctt tgtccgccag ggctoagcgg agtcetacac cagcegtcca tcagactctg 
atgtatotct ggaggaggac cgggaagect taaggaagga agcagagcgc oaggoattag 
cgcagotega gaaggocaag accaagccag tggcatttgc tgtgcggaca aatgttggct 
aoaatccgtc tccaggggat gaggtgcctg tgcagggagt ggceatcacc ttcgagccca 
aagacttcct goacatcaag gagaaataca ataatgactg gtggateggg cggctggtga 
aggagggctg tgaggttggc ttcattccca gceccgtcaa actggacagc cttcgcctgc 
tgcaggaaca gaagctgcgc eagaaecgcc tcggetccag caaatcaggc gataactcca 
gttceagtot gggagatgtg gtgactggca cccgccgccc oacaccccot gccagtgcca 
aacagaagca gaagtcgaca gagcatgtgc occcctatga ogtggtgoot tccatgaggc 
ccatcatcct ggtgggaccg tcgcteaagg gctaogaggt tacagacatg atgcagaaag 
ctttatttga cttcttgaag eatcggtttg atggcaggat ctccatcact cgtgtgacgg 
oagatattte cotggctaag cgcteagttc toaaoaaccc cagcaaacac atcatcattg 
agcgetccaa eacacgctcc agcctggctg aggtgcagag tgaaatcgag ogaatottcg 
agctggcccg gacccttcag ttggtcgotc tggatgctga caccatcaat cacccageec 
agctgtccaa gacctogetg gcccccatca ttgtttaeat eaagatcacc tctcccaagg 
tacttcaaag gotcatcaag teoegaggaa agtetcagtc oaaaoacctc aatgtccaaa 
tagcggoctc ggaaaagctg gcaeagtgcc cccctgaaat gtttgacatc atcctggatg 
agaaecaatt ggaggatgcc tgcgagcatc tggcggagta ottggaagcc tattggaagg 
ccacacaccc gcccagcagc acgccaccca atecgctgot gaaccgcacc atggctaccg 
cagccctgcg ccgtagccct gcccctgtot ceaacctcca ggtacaggtg ctcacctcgc 
tcaggagaaa cctoggcttc tggggcgggc tggagtcctc acagcggggc agtgtggtgc 
cacaggagca ggaacatgcc atgtagtggg cgccctgccc gtcttccctc ctgctctggg 
gtcggaactg gagtgcaggg aacatggagg aggaagggaa gagctttatt ttgtaaaaaa 
ataagatgag cggcaaaaaa aaaaaaaaaa a 

<210> 3 

<211> 698 

<212> DNA 

<213> Homo sapiens 



<400> 3 

ttttcotttc gctgctgcgg ccgcagccat 
ctctagtgtc ctcogctgtg gcaagaagaa 
aatcgccaat gocaaotccc gtcagcagat 
ccgcaagcct gtgacggtcc attcccgggc 
gaagggcagg cacatgggca taggtaagcg 
gaaggtcaca tggatgagga gaatgaggat 
atotaagaag atcgatcgcc aoatgtatca 
gttcaaaaac aagcggattc tcatggaaoa 
caagaagctc ctggctgaec aggctgaggc 
gcgccgtgaa gagogcctcc aggccaagaa 
ggaagagacc aagaaataaa acctcccact 
atagatcagc cattaaaata aaacaagcct 



gagtatgctc aggcttcaga agaggotcgc 
ggtotggtta gaococaatg agacoaatga 
coggaagctc atcaaagatg ggctgatcat 
tcgatgccgg aaaaacacct tggcccgccg 
gaagggtaca gcoaatgccc gaatgccaga 
tttgcgccgg ctgctcagaa gatacogtga 
cagcctgtac ctgaaggtga aggggaatgt 
catocacaag ctgaaggcag acaaggcccg 
ccgcaggtct aagaccaagg aagcaogcaa 
ggaggagatc atcaagactt tatccaagga 
ttgtctgtac atactggcct ctgtgattac 
taatctgc 
<210> 4

<211> 5810 

<212> DNA 

<213> Homo sapiens 



<400> 4 

gggaagatgg cggcggcctc gagcaccctc ctcttcttgc cgccggggac ttcagattga 
tccttcccgg gaagagtagg gaotgetggt gccotgcgtc ccgggatcec gagcoaactt 
gtttcctccg ttagtggtgg ggaagggctt atccttttgt ggcggatcta gottctcctc 
gccttcagga tgaaagctca ggggggaaac egaggagtoa gaaaagctga gtaagatgag 
ttctctcctg gaaoggctca atgcaaaatt taaccaaaat agaccctgga gtgaaacoat 
taagcttgtg cgtoaagtca tggagaagag ggttgtgatg agttctggag ggcatoaaca 
tttggtcagc tgtttggaga cattgcagaa ggctctcaaa gtaaeatctt taccagcaat 
gactgatcgt ttggagtcca tagcaggaca gaatggactg ggctotcatc toagtgccag 
tggcaotgaa tgttaoatca cgtcagatat gttctatgtg gaagtgcagt tagatcctgc 
aggacagott tgtgatgtaa aagtggctca ccatggggag aatcctgtga gctgtccgga 
gcttgtacag cagctaaggg aaaaaaattc tgatgaattt tctaagcacc ttaagggcct 
tgttaatctg tataaccttc caggggacaa caaactgaag actaaaatgt acttggotct 
ccaatcctta gaacaagatc tttetaaaat ggeaattatg tactggaaag caactaatgc 
tggteocttg gataagattc ttcatggaag tgttggctat ctcacaccaa ggagtggggg 
toatttaatg aacotgaagt actatgtctc tcettotgac otaotggatg aoaagactgo 
atctcccatc attttgcatg agaataatgt ttctcgatot ttgggcatga atgcatcagt 960 
gacaattgaa ggaaoatctg ctgtgtacaa actcccaatt gcaocattaa ttatggggtc 1020 
acatocagtt gacaataaat ggaccccttc cttctcctca atcaccagtg ccaacagtgt 1080 
tgatcttcct gcotgtttct tcttgaaatt tccccagoca atcccagtat ctagagoatt 
tgttcagaaa otgcagaact gcacaggaat tccattgttt gaaactcaac caacttatgc 
acccctgtat gaactgatca ctcagtttga gctatoaaag gaocotgaoc coataccttt 
gaatcacaac atgagatttt atgctgctct tcctggtcag cagcactgct atttcctcaa 
caaggatgot ootcttccag atggccgaag tctacaggga acccttgtta goaaaatcac 
otttoagcac cctggccgag ttcctottat cctaaatctg atcagaoacc aagtggccta 
taacaccctc attggaagct gtgtcaaaag aactattctg aaagaagatt ctootgggct 
tctccaattt gaagtgtgte ctctctoaga gtctogttto agcgtatctt ttcagcaoco 
tgtgaatgac tocctggtgt gtgtggtaat ggatgtgcag ggcttaacac atgtgagctg 1620 
taaactctac aaagggctgt cggatgoact gatctgcaca gatgacttca ttgccaaagt 1680 
tgttcaaaga tgtatgtcca tccctgtgac gatgagggct attoggagga aagctgaaac 
oattoaagcc gacaccccag cactgtccct cattgoagag acagttgaag acatggtgaa 
aaagaacctg cccccggcta gcagcocagg gtatggcatg acoacaggca acaaocoaat 
gagtggtacc actacatoaa ccaacaoctt tccggggggt cccattgoca cottgtttaa 
tatgagcatg agcatcaaag atcggoatga gtcggtgggc oatggggagg acttoagcaa 
ggtgtctcag aacccaattc ttacoagttt gttgcaaatc acagggaacg gggggtctac 
cattggctcg agtocgaoco ctcctcatca cacgccgcca cctgtotctt cgatggocgg 
eaacacoaag aaccacccga tgctcatgaa octtctcaaa gataatcctg cccaggattt 
ctcaaccctt tatggaagca gccctttaga aaggcagaac tcctcttccg gctcaccccg 
catggaaata tgctcgggga gcaacaagac caagaaaaag aagtcatoaa gattaocaoc 
tgagaaacca aagcaccaga ctgaagatga ctttcagagg gagctatttt caatggatgt 
tgactcacag aaccctatct ttgatgtcaa oatgacagct gacacgctgg atacgocaca 
catcactcoa gctccaagoo agtgtagcac tcccccaaca acttacccac aaccagtacc 
tcacccccaa cccagtattc aaaggatggt ccgactatoc agttcagaca gcattggccc 
agatgtaact gacatccttt cagacattgc agaagaagct tctaaacttc ccagcactag 
tgatgattgc ccagccattg gcacccctct tcgagattct tcaagctctg ggoattctca 
gagtaccotg tttgactctg atgtctttca aactaacaat aatgaaaatc catacactga 
tccagctgat cttattgcag atgctgctgg aagccccagt agtgactctc ctaccaatca 
tttttttcat gatggagtag atttcaatcc tgatttattg aacagcoaga gceaaagtgg 
ttttggagaa gaatattttg atgaaagcag ccaaagtggg gataatgatg atttcaaagg 



120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 



1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 



1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
atttgcatct caggcactaa
gaccaagttt aagggcaata 
cggcaaagct ttagctcctg 

actgaccact ggggacttag

caccagtaat agtactctct 
gaccccttct aatgatggga 
tgagggaaag totccatctc 
aggtggatot aaatcgccag 
accacooatt coeaaaatca 

ttcctctcac agtcagtata

ccatagcoat tcttcctcct 
taaatcagaa ggttcatcaa 
ttotggatct agccagtcca 
cataaccaag catggactga 

gccatoatca cttatgaatc

gccacctgga ggotctgaoa 
atcctctaaa gccaagtccc 
tagttcaagc tcstggcatga 
gaaaactccc ccatcatota 
ctcttccatg teatcctotc 

aaacaagaag ccgtccttga

tggccctggg ggtgaagacc 
ccatcctatg tcetccaaac 
aagtgataaa gacaaatcaa 
gacctcagag tcaaaaaatg 

tgatggaggc tcooetagca

tggagaaggg cttaggcctc 
tggttccact ccaaagcatg 
cccccagaat etggacagtg 
gaatagtccc agotcagacg 
taagaagcac aaaaaggaaa 

caaagaccga gacaagaaaa

catctcttca gaccagtcct 
ctcaaggctc agcccagact 
cctgattggg aattaggaac 
attgataagt ttataggcaa 

agaatcctaa atggcatggc

ccctattaaa gaaaccacag 
gccagaaaga aagttaaaat 
agagggaagg gagggaaaca 
ttooattttt aggocatgtt 
aaaatgaaaa agcaatacat 

aggtagttgt ccgtttgagt

gtaatgtttg gtgtotgttt 
tatttgtaat attttaaccc 
atgaggttta gggttcaaaa 
agottatagc ttgtgctaat 

tttagagtta aatataatag

agggttcaga gatcatagga 

<210> 5 

<211> 5515

<212> DNA 

<213> Homo sapiens 



atactttggg ggtgccaatg cttggaggtg ataatgggga 2940 

accaagocga cacagttgat ttcagtatta tttcagtagc 3000 

cagatcttat ggagcatcac agtggtagtc agggtccttt 3060 

ggaaagaaaa gactcaaaag agggtaaagg aaggcaatgg 3120 

cggggcccgg attagacagc aaaccaggga agcgoagtcg 3180 

aaagcaaaga taagcctcca aagcggaaga aggcagaoac 3240 

atagttcttc taacagacot tttaccocac otaccagtac 3300 

gcagtgcagg aagatctcag actcccccag gtgttgccac 3360 

ctattcagat tcctaaggga acagtgatgg tgggcaagcc 3420 

ccagcagtgg ttctgtgtct tootcaggca gcaaaagcca 3480 

cttccteatc tgcttccacc tcagggaaga tgaaaagcag 3540 

gttccaagtt aagtagcagt atgtattcta gccaggggtc 3600 

aaaattcatc ccagtctggg gggaagccag gctcctctcc 3660 

goagtggatc tagcagcacc aagatgaaac cteaaggaaa 3720 

cttctttaag taaaceaaac atatcocctt ctoattcaag 3780 

agcttgccto tccaatgaag cctgttcctg gaactcctcc 3840 

otatcagtta aggttctggt ggttctcata tgtotggaac 3900 

agtcatcttc agggttagga tccteaggct ogttgtocca 3960 

attcctgtac ggcatottcc tcetcctttt ootcaagtgg 4020 

agaaccagca tgggagttct aaaggaaaat ctcceagcag 4080 

cagctgtcat agataaactg aagoatgggg ttgtoacoag 4140 

cactggaogg ooagatgggg gtgagcacaa attcttccag 4200 

ataacatgto aggaggagag tttoagggca agcgtgagaa 4260 

aggtttccac ctccgggagt tcagtggatt cttctaagaa 4320 

tggggagcac aggtgtggca aaaattatca tcagtaagca 4380 

ttaaagccaa agtgactttg cagaaacctg gggaaagtag 4440 

aaatggcttc ttctaaaaac tatggctotc caotcatoag 4500 

agcgtggoto tcccagccat agtaagtcac cagcatatac 4560 

aaagtgagto aggctcctcc atagcagaga aatcttatca 4620 

atggtatccg accacttcca gaataoagca cagagaaaca 4680

agaagaaagt aaaagacaaa gatagggacc gagaccggga 4740 

aatctcatag catcaagcca gagagttggt coaaatcaoo 4800 

tgtctatgac aagtaaoaca atcttatctg oagacagaoc 4860 

ttatgattgg ggaggaagat gatgatctta tggatgtggc 4920 

cttatttcct aaaagaaaca gggccagagg aaaaaaaaot 4980 

accaccataa ggggtgagtc agaoaggtct gatttggtta 5040 

tttgacatoa agctgggtga attagaaagg catatcoaga 5100 

ggtttgattc tggttaccag gaagtcttct ttgttoctgt 5160 

acttgcttaa gaaagggagg ggggtgggag gggtgtaggg 5220 

gttttgtggg aaatattoat atatattttc ttctcccttt 5280 

ttaaactcat tttagtgcat gtatatgaag ggctgggcag 5340 

tccttgatgc atttgcatga aggttgttca aotttgtttg 5400 

catgggcaaa tgaaggaott tggtoatttt ggacacttaa 5460 

cttaggagtg actgggggag ggaagattat tttagctatt 5520 

tttatctgtt tgtttttata cagtgtttcg ttotaaatct 5580 

tgatggaagg ccgaagagca aggcttatat ggtggtaggg 5640 

actgtagoat caagccoaag caaattagtc agagcccgcc 5700 

aaaaaccaaa atgatatttt tattttagga gggtttaaat 5760 

atattaggag ttacctotct gtggaggtat 5810 
<400> 5

cttttttccc ttcttcaggt caggggaaag ggaatgccca attcagagag acatgggggc 60 

aagaaggaog ggagtggagg agcttotgga actttgcago cgtcatcggg aggcggeagc 120

tctaacagoa gagagcgtca ccgcttggta tcgaagcaca agoggoataa gtccaaacac 180 

tccaaagaca tggggttggt gacccccgaa gcageatccc tgggcacagt tatcaaaoct 240 

ttggtggagt atgatgatat cagctctgat tccgacacct tctccgatga catggccttc 300 

aaactagacc gaagggagaa cgacgaacgt cgtggatcag atcggagcga cogcctgoac 360 

aaacatcgtc accaccagca eaggogttcc cgggacttac taaaagctaa acagaocgaa 420

aaagaaaaaa gccaagaagt ctccagcaag tcgggatcga tgaaggaccg gatatcggga 480 

agttcaaago gttogaatga ggagactgat gaotatggga aggcgcaggt agccaaaagc 540 

agcagcaagg aatccaggtc atcoaagctc cacaaggaga agaccaggaa agaacgggag 600 

otgaagtctg ggcacaaaga ecggagtaaa agtcatcgaa aaagggaaao acccaaaagt 660 

tacaaaacag tggacagccc aaaacggaga tcoaggagcc oooaeaggaa gtggtotgac 720

agctccaaac aagatgatag cccctcggga gcttettatg gocaagatta tgaccttagt 780 

occtcacgat etcatacctc gagcaattat gactectaea agaaaagtoo tggaagtacc 840 

tcgagaaggc agtcggtcag tccocottao aaggagcctt oggoctaoca gtccagcacc 900 

cggtcaccga gcccctaaag taggogacag agatotgtca gtooctatag caggagacgg 960 

tcgtocagct acgaaagaag tggctcttac agcgggcgat cgcccagtce ctatggtcga 1020 

aggcggtcca gcagcccttt ectgagcaag cggtotctga gtcggagtcc actcccoagt 1080

aggaaatcca tgaagtccag aagtagaagt cctgcatatt caagacattc atcttctcat 1140 

agtaaaaaga agagatccag ttcacgcagt cgtcattcca gtatctcacc tgtcaggctt 1200 

ccacttaatt ccagtctggg agctgaactc agtaggaaaa agaaggaaag agcagctgct 1260 

gctgctgoag oaaagatgga tggaaaggag tcoaagggtt cacctgtatt tttgcctaga 1320 

aaagagaaca gttcagtaga ggctaaggat tcaggtttgg agtctaaaaa gttacccaga 1380

agtgtaaaat tggaaaaatc tgcocoagat actgaactgg tgaatgtaac acatctaaac 1440 

acagaggtaa aaaattctte agatacaggg aaagtaaagt tggatgagaa ctccgagaag 1500 

oatcttgtta aagatttgaa agcacaggga acaagagact ctaaacocat agcactgaaa 1560 

gaggagattg ttactccaaa ggagacagaa acatcagaaa aggagacccc tacacctctt 1620 

cccacaattg cttctccccc accccctcta ccaactacta cccctccaoc tcagacaccc 1680 

cctttgccac ctttgcctco aataccagct cttccacago aaocacctot gootocttct 1740

cagccagcat ttagtcaggt tcctgcttcc agtacttcaa ctttgccccc ttctactcac 1800

tcaaagacat ctgctgtgtc ctctcaggca aattctcagc cccctgtaca ggtttctgtg 1860 

aagactcaag tatctgtaac agctgctatt ccacacctga aaacttcaac gttgcctcct 1920 

ttgccoctcc cacccttatt aootggaggt gatgacatgg atagtocaaa agaaaotott 1980 

ccttcaaaac otgtgaagaa agagaaggaa cagaggacac gtcaottact cacagacott 2040

octctccctc cagagctccc tggtggagat ctgtctoccc cagactctcc agaaccaaag 2100 

gcaatcacac oacctcagoa accatataaa aagagaooaa aaatttgttg tcctcgttat 2160 

ggagaaagaa gacaaacaga aagcgactgg gggaaacgct gtgtggacaa gtttgacatt 2220 

attgggatta ttggagaagg aacctatggc caagtatata aagccaggga caaagacaca 2280 

ggagaactag tggctctgaa gaaggtgaga otagacaatg agaaagaggg cttcccaatc 2340 

aoagcoatto gtgaaatcaa aatccttogt cagttaatoc accgaagtgt tgttaacatg 2400

aaggaaattg toacagataa acaagatgca ctggatttca agaaggacaa aggtgccttt 2460 

taccttgtat ttgagtatat ggaccatgao ttaatgggac tgctagaatc tggtttggtg 2520 

cacttttctg aggaocatat caagtcgttc atgaaacagc taatggaagg attggaatac 2580 

tgtcacaaaa agaatttcct goatcgggat attaagtgtt ctaaoatttt gctgaataac 2640 

agtgggcaaa tcaaaotagc agattttgga cttgotoggc tctataactc tgaagagagt 2700

cgcccttaca caaacaaagt cattactttg tggtaccgac ctccagaact actgctagga 2760 

gaggaacgtt acacaccago catagatgtt tggagctgtg gatgtattct tggggaacta 2820 

ttcacaaaga agcctatttt tcaagccaat otggaactgg ctcagctaga actgatcagc 2880 

cgactttgtg gtagcocttg tccagotgtg tggcctgatg ttatcaaact gccctacttc 2940 

aaoaooatga aaccgaagaa gcaatatcga aggcgtotac gagaagaatt ototttcatt 3000 

ccttctgcag oacttgattt attggaccac atgctgacac tagatcotag taagcggtgc 3060

acagctgaac agacoctaoa gagcgacttc cttaaagatg togaactcag oaaaatgget 3120 

ootcoagaco tcccccactg gcaggattgc catgagttgt ggagtaagaa acggcgaogt 3180 

cagcgacaaa gtggtgttgt agtcgaagag ccaoctccat ccaaaacttc tcgaaaagaa 3240 

actacctcag ggacaagtac tgageotgtg aagaacagca gcccagcaoo acctcagcct 3300 

gctcctggca aggtggagtc tggggctggg gatgcaatag gccttgotga oatcacacaa 3360
tgtgccattt gtatctgtgt ctggcccaat gagagtgttg aaaggtgagc oacaagataa 720
aacagcaact tcetacctcc cttatcaaga cagctgtctg acctacctcc ccttggocac 780 
tcttgggatt actggggttg gcttcagtat tttcagattt ttoagaaggg gaggagaatg 840 
cttgagtctc atccaggaao ttaggcagtt ctcagcactg cctgctcctc ctccctcaaa 900 
taaccaagtc tgaagaccag gagagaaagc cgctggtgga ctggtcacct gtctggoagt 960 

1020 
1080 
1140 
1200 
1260 
1320 
1380 



gggaggagga gagtgagagg tttctaggta ggaatccaga cttagaccct cccctccacc 
cccagatggg tggtgcacag gctcatctcg cggcccctcc ccactccacc ctaacatgga 
tacgccccca acaaccaagg aaagatctcc catcggctga ctecacagat acacacatgt 
ccccacagac acacacacgc ccatgcagag gcacagacat ccaggcacat etttcccttt 
otctgtcttt oocttggttt gaatttogtt tagccacata tgttgtgtgt gcgtgagggt 
gggtggggga ggggcagaca gggatgaggg atggcatggt gccaacatct aoetatgggg 
ctcgggccag ggacgcccct fcacagccatc otgggagggg gtctcagetg tccctttgtg 
gccaagggga ccctcctggg gagtgggggc aagcacagag gtcctttctc occaacocgg 1440 
ggtctggtcc ctgacccacc ttgggggcct gcaggggagg aaatggacag agcgggaccc 
tgagggagca tagaattggc caccacgagc ccccagtgto oagocttgcc aocccattgt 
tcccgtgagg gggtetotat atacaggggg caactootcc caoottccto tcaatccotg 
otttocotgc gttgggoggg gaggggaggg cggcagaaat atttatttat ttootttatt 
tatttaattt tttttttttt tttttggagt agagagtgac agatggcggc gggtcccggg 
ggagccggct otcocccagt gcagacgcat gccaatcacc gtctotcatg tgatagctgc 
tgooogtgac gtgccaagcc catatggcct ggcatagagg ctggtacccc gcetggtaga 
gatgccacac tcgctccgcg gttcgcatgg egctctgaag aogocggcgc ccgcogcctt 
gaggagccgc tgcccccgct acetgaagat gggggaaoaa tgaaataagc gagaagatcc 
ctottetccc ccctctctct cttgcccoct coccccctcc cctcccctct ooocttgaot 
cctctccgag gtaagttgtc cgaaagggag cgagatetga eocgccggtt gggaggaggg 
gcggeagett cggccgacag gagggtcctc aaatacctcc ttcctgggat gatgcccccc 2160 
tcattgggtg ggcatcggag gggccccagg ttctctctcc ottagggget goagcccagg
gggctgcaga ggaggtgtct ctgcetgoga tgggctoggt ggggggggaa ggcaggatoa 
cggaggggga tatgcgaaga ggocgagacg gaggacccct ccatggttgt cccaaaaagc 
ctgccacctt tccccaccac cgaaaaaagg gaagcaaaca aacaaatttg gatttttccc 
ccatcaatcc caaaatacaa cgagatetga agagecttgt gggagggagt cagcttgaag 
ggggaagggg gtccctgacc gcagagggga eggactggge tcgcttctct cagtctcctc 2520 
cccacgccoc gotgettcag tcetcgccgc ccagagccgg ctccgggagc tggggacgoa
teggetagag gagacgatcc tcccgcctct ggaattgggg gtgcgggggt gggggecgag 
caaggggogg cgcgcagcca agttgcaaat tggattaggg agcgtggggg tgagagecac 
gggaggggtg agggagctgg geegggggge ccgggccgcg agagegegga geggggcage 
tgtccccacc ggcggccgao oagectotot ccaccgccag gagagaaegg gctttcaggg 



1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 



2220 
2280 
2340 
2400 
2460 



2580 
2640 
2700 
2760 
2820 

cgagcgcgcc gcotcccctg gcaaagatat ctggtccota aaacccooac coggtccctg 2880 
ccctgaccct gagaagaagc aggegegggg agcagcccoo cattcaagcg aggggeggag
ccggggccca gcgccgggga gagggectgg gecgagatco caggccggca geegggtagg 
getgggcegg etctgggegg ggcaggegge ggaggtgggc atccagggta gcotaggoag 
gagooogcac gagacteggg ggtggaggag ggttgtgggg gggcgtoggt accccagcgo 
gcccctcact ttgtgctgtc tgtctcocot tcccgcccgc ggggcgccct caggcaccat 
gctgacccgc ctgttoagcg agcccggcct tctctcggac gtgcccaagt togecagctg 
gggegaegge gaagacgacg agecgaggag cgacaagggc gaogegooge caccgccaoc 
gcctgcgccc gggccagggg ctccggggco agcccgggcg goeaagecag tooctctccg 
tggagaagag gggaoggagg coacgttggc cgaggtcaag gaggaaggcg agotgggggg 
agaggaggag gaggaagagg aggaggaaga aggactggac gaggoggagg gegageggoe 
caagaagogc gggeccaaga agegcaagat gaecaaggeg cgcttggagc gctccaagct 
teggeggcag aaggcgaacg cgcgggagcg caaccgcatg cacgacctga acgcagccot 
ggaoaacctg cgcaaggtgg tgccctgcta ctccaagaog oagaagotgt coaagatcga 
gaegctgoge etagecaaga aotatatctg ggogctctcg gagatoctgc gctccggcaa 
gcggccagac ctagtgtcct aegtgeagae tctgtgcaag ggtctgtcgc agcccaccao 
caatctggtg gccggctgtc tgcagctcaa ctctcgcaac ttcctoaogg agoaaggege 
cgacggtgcc ggccgcttcc aoggctcggg cggccogttc gccatgcacc cctacccgta 
cccgtgctcg cgcctggcgg gcgoacagtg ccaggcggcc ggcggcctgg gcggcggcgc 
ggogoacgee ctgcggaccc aeggctactg cgccgcctac gagaogotgt atgeggeggo 
aggcggtggc ggcgcgagcc eggactacaa cagctccgag tacgagggee cgctcagccc 
cccgctctgt ctcaatggca aottctcact caagcaggac tcctcgcccg aooacgagaa 
aagctaocac tactctatgc actactoggc gctgcccggt tcgogecacg gecaegggot 



2940 
3000 
3060 
3120 
3180 
3240 
3300 
3360 
3420 
3480 
3540 
3600 
3660 
3720 
3780 
3840 
3900 
3960 
4020 
4080 
4140 
4200 
agtcttcggc tcgtcggctg tgcgcggggg cgtccactcg gagaatctct tgtottacga
tatgcacctt cacoaogaoc ggggcoocat gtacgaggag ctoaatgcgt tttttcataa 
ctgagaotto gogccggctc ccttcttttt cttttgcctt tgcccgoccc cctgtcccoa 
gcccccagoa gcgcagggta cacccccatc ctaccccgge gccgggcgcg gggagcgggc 
caccggtcct gccgctotcc tggggaagcg cagtcctgtt acctgtgggt ggcctgtccc 
aggggcctcg cttcccocag gggactcgcc ttctctctcc ccaaggggtt ccctcotcct 
ctctcccaag gagtgcttct ccagggacct ctctccgggg gctocctgga ggcacccctc 
ccccattooo aatatottcg ctgaggtttc ctcctccccc tcctccctgc aggcccaagg 4680 
cgttggtaag ggggcagctg agcaatggaa cgcgtttccc cctctcatta ttattttaaa 4740 
aacagacacc cagctgooga ggcaaaaagg agccaggcgc tccctcttto ttgaagaggg 4800 
tagtattttg ggcgccggag cccgggcctg gaaogccctc acccgcaacc tccagtctcc 4860 

4920

4980 



gcgttttgcg attttaattt tggcgggagg ggaagtggat tgagaggaaa gagagaggcc 
aagacaattt gtaactagaa tocgtttttc ccttttcctt tttttaaaoa aacaaacata 



agtccggatc ggagagaaaa cgcagtaagg acttttagaa gcaataaaag gcaaaaaaaa 



<210> 7 

<211> 2020 

<212> DNA 

<213> Homo sapiens 



<400> 7 

gctaotgagg cogcggagoc ggactgoggt tggggcggga agagccgggg ccgtggctga 
oatggagcag ccctgctgct gaggccgcgc cctccccgcc ctgaggtggg ggcooacoag 
gatgagcaag ctgcccaggg agctgacccg agacttggag ogcagcotgo ctgccgtggc 
ctccctgggc tcctcactgt cccaoagoca gagoctctcc togcacctcc ttccgccgcc 
tgagaagcga agggccatct ctgatgtccg ccgcaccttc tgtctcttcg tcaccttcga 
octgctctto atctccctgc tctggatcat cgaactgaat accaacacag gcatccgtaa 
gaacttggag caggagatca tccagtacaa ctttaaaact tccttcttog acatctttgt 
cctggcotto ttccgcttot ctggactgct cctaggctat gccgtgctgc agctccggca 
ctggtgggtg attgcggtca cgaogctggt gtccagtgca ttcctcattg tcaaggtcat 
cctctctgag ctgctcagca aaggggoatt tggctacctg ctccccatcg tctcttttgt 
cctcgcctgg ttggagacct ggttccttga cttcaaagtc ctaccccagg aagotgaaga 
ggagcgatgg tatcttgccg cccaggttgc tgttgcccgt ggacccctgc tgttctccgg 
tgctctgtcc gagggacagt tctattcaec cccagaatcc tttgcagggt ctgaoaatga 
atcagatgaa gaagttgctg ggaagaaaag tttctctgct caggagcggg agtaoatccg 



4260 
4320 
4380 
4440 
4500 
4560 
4620 



aagctaagag gcgacggaag ccgaacgcag 5040



5100 

caaaaaacaa aaaaacaaac aaaaaaaaac cactactacc aataatcaaa gacacaaata 5160 

5220 
5280 
5340 
5400 



tctatgcaag gaggctccac tgagcctcgc ggccoggccc ggccocggga tgceccgccc 
ggcctgcggg ccgccccgcc cgagcgcgga tetgtgcact ttggtgaagt gggggecogo 
gccgccccot ccccctooce aggttcttac aatoagtgac tcggagattt ggggccccag 
tgccactgoo etcccecgcc ccgtccccgt tgtgcgtcat gctgtttttt aaaaacctgt 
ttccaaattt gtatggaatg gcaaaetgtt ggggggtcgg tttggggagg gagggtttgc 5460 
atgaaagaca cacgcacacc acaccgcacg caoaagcagg cccggogocg gcgtccgggg 
ggcagaagga ggtgagetcg ccggctectc ctccccgcgg coattctgtc coctootggg 
gtgaggggtg gggatggaga cctgggggca gceccacccc tgocoggact gtgoctcggt 
gggtgccacc tggcgatttc eggtgtetgg agagagtatt ttttggtcca aggagtcctc 
ttggetttag ctggtgggtg ggcggggaga ggtctgaggg ctcctactgg aggttocccc 
aaaaaggggo aaaaggagac cctctgccca ccggaggcag gggatcaggc atccaaatac 5820 
acgatgcaaa aatgcaatcc eaoaggegac acacccacao actcacccac acacacgcaa 5880 
ttttacctto ctcttgtagc gaagatgaaa ctcccgtcgg acacccgaag tgcattgcgt 5940 
gtttctgttc agtttaatga cgattaataa atatttatgt aaatgagatg caaagocgga 6000 
coggtttctc acggtggcct catttcattg aggggggaga gaaggtttga gctggggctg 6060 
gggtgatgaa ggcagagtgt caagtgactg tgcagaggco aaacagaggg acttoccagc 6120 



5520 
5580 
5640 
5700 
5760 



120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
ccaggggaag gaggooaogg cagtggtgga
gtttgagaag aataatgaat atggggacac 
caagacgttt atcctgaaga ccttcctgcc 
gatcctgeag ccogagagga tggtgctgtg 
gcagcgagtg gaagacaaca coctcatctc 
cgtggtctoc ccaagggact tcgtgaatgt 
cttgtcatca gggatcgcoa cotcaoacag 
gggagagaat ggcootgggg gcttcatcgt 
cacctttgtc tggattctta atacagatet 
ccagagcctc gcggccacca tgtttgaatt 
gctgggggoc cgggogtgac tgtgccccct 
ccacttccag agccagaaag ggtgcoagtt 
ccaggctgtc accctccace gagccacgca 
tggggtggag cactggactc cggggcccca 
gatgtttaca tggcgccctg eotcctggag 

gcagggtctg ggctgggcac ctgacttggc 
ggcagcctgt oacccgtgtg aagatgaagg 
ttttttagga ttattgaaag agtctgggae 
tgggctgctg gccatgaatc tctgcctctc 
tgggggaoot ttgtattaag ccaattaaaa 

<210> 8 

<211> 1730 

<212> DNA 

<213> Homo sapiens 



ccagatcttg gcccaggaag agaactggaa 
cgtgtacacc attgaagttc cctttcacgg 
ctgtcctgog gagotcgtgt accaggaggt 
gaacaagaoa gtgactgcct gccagatcct 
ctatgacgtg tctgoagggg otgogggcgg 
ccggcgcatt gagcggcgca gggaccgata 
tgccaagccc ccgacgcaca aatatgtccg 
gctcaagtcg gccagtaacc occgtgtttg 
caagggocgc ctgccccggt acctcatcca 
tgcotttcac ctgcgaoagc gcatcagcga 
eecaocctgc gggccagggt cctgtcgcoa 
gggctcgcac tgcccacatg ggacctggco 
gtgcctggag ttgactgact gagcaggctg 
ctggctggag gaagtggggt otggoctgtt 
gaccagattg ctctgcooca ccttgccagg 

tggggaggac cagggccctg ggoagggoag 
ggctctteat ctgcctgcgc tctcgtcggt 
ccttgttggg gagtgggtgg caggtggggg 
ccaggctgtc cccctcctcc cagggcctcc 
acatgaattt 



<400> 8

gtggtgaggg tgactgggga 
ttgcactctc ccttctgggg 
gcaggacctg gcagccctgg 
tctgtttaca gacaasctgc 
ktgccagggg ggcacaaggc 
aggcacctgg aggcattgac 
cagcgtccag gtaccccaac 
gctctactta tagcatctga 
agggggctat ttaaagggcc 
agagctgagc tgcgaggtgt 
atggaaggat ctgacactgt 
tccctgcctc tgctccccca 
tggagaaatt tctgggggtg 
cacccttcct gcccccaggt 
gagagcaaca gctcccagga 
aggaggacac ccagagacat 
agcgctcgcc ctggctgatg 
agctgcccta ccagcgggta 
ccaaggagga gcgtgaggac 
ccctgggtgg ccagtgtgtg 
ctgtggtgcc tgtcagcaag 
aggaagcaca gagaggctga 
gctgggccct tcctggctag 
tagtttgccc agagttgggg 
ccctgagttc ccaaagggag 
gccagcacag ctgaggaccc 



ctaggcacta ggcctttggt 
atatgccctt gagcccaggc 
tacagagccc agagggggca 
tgtcctccct gcaaagggga 
tgggcatgtg gctggcatga 
cccaggacct tggaccccag 
ccctgccctg ggtccggcgt 
caccagaggg gccgaaaata 
tgggagggga gagagaatga 
cggaggagaa ctgtgagcgc 
ccacacggcc cgaggagggg 
gagcaccctc actgagccat 
ggggcaggaa gaatgcccca 
cccagcagcc caggggagcc 
gctcactgcc cctcccctct 
gagacctacc accagcaggg 
atgcggatgg gcatcctcgg 
ctgccgctgc ccatcttcac 
acccccatcc agcttcagga 
gaccgccagg aggtggctga 
cccggtgcmc ttcgtcgctc 
gagggactgt gacttgggct 
gacctgtgga ggggcagctc 
gctaggggag gggggagcca 
ggtggcagag acagtgggca 
tcagccccag gagaagggac 



gcaggcgcct gaggacktgg 
agaggagagc acagcccagg 
tcagttcctg ctggtcctgc 
gtgggtgggg cagagggcaa 
gacggtgtct gagtaatgtc 
acctctgacc gtggggcagc 
ccccccatta gtgagtcttg 
gcccctggag aagggggagg 
ggagtgatca tggctacctc 
cgggaggcct tctgggcaga 
tgagtgtggg tctgctagag 
gaggccagag catgaagccc 
tggggagagc aaaggggaac 
ccccacccag cctgtgccca 
ccccagctgc tccctgcatg 
gcagtgccag gtgetggtgc 
ccgtgggctg caggagtacc 
ccctgccaag atgggcgcca 
gctgctggcg ctggagacag 
gatcacaaag cagctgcccc 
cctgtcccgc tccatgtccc 
ccgctgtgcc cgccccctgg 
gctggcccat ggctgctttg 
gaggccagga tgcctgagcc 
ctaagggtgg agagttgggg 
aaaaggtact ggtgagggca 






EP 1 365 034 A2 



agaggtgcct gggaggagtg gcoctgatcc aggaaaatgt gaggggaatc tggaaogctc 
taggcagaag aagctgggag ggagggggag gtgaaaaggg cagaggoaag gatggtgggg 
cccccagcao ectotgttag tgocgcaata aatgctcaat catgtgccag 



1620 
1680 
1730 



<210> 9 

<211> 3799 

<212> DNA 

<213> Homo sapiens 



<400> 9 

ctggcaotgg gtggtaacca gcaagccagc tggeatccgo atccagggtt tgtttcaatg 
atgtctcgtg gagaatatgg aggggctggt gccaggactg tccttggctt tgcctcgggg 
tgtgaaoggg gtoagtgaec tctaaaacta acetgcctct cagttctgaa tccagacaga 
atcaatcctc agctgtgtct cgetceacac eoeotgccct ggaagcoagg gaaggttgga 
ggtgctaggg ggtoaggotc ccctctgtga cccctgcago tgttgtggtg actcatgtoc 
caacotagct gcctotccca aggagacttt cccctgggao aagggggagg gaatggcatg 
gaggaggcac acatcaagcg gggeoaggaa oeeacggtgg caggagctgg gctggtgacc 
tacccagggc agaagggccc gggactcatc cagaggggaa ggaaggggtc ttcaggaaga 
coacggagat gccaoaggca gaattggctt eccatctggg agataggtgg ggagaccetg 
gcattttgac ageeagaacc tggggtgctg agcagaatct tcatgcctgg cctggccgcc 
ttcggaggga agatggaggg ttgggtgcga gaggagtggg gtcagagooc ctaoatccgc 
aggaccceaa atcggctggg coccaaggcc cggactgogo tccocggtgg ccocggcggc 
cctccgcgaa tgcgtectge oootcecctg eccaagccct ctgccctcac ccgggtccgg 
cgccgccocc gaagtggcgg gaacaacccg aacccgaacc ttctgtcctc gggagccccc 
agataagcgg ctgggaaccc gcggggcccg caggggaggc ccggctgttc cgcccgctaa 
gtgcattagc acagctcacc tccootatcg cgcctgccat cggacgggca gtgccgcgcc 
ctgctctggg gcccocggag cgaccacagc ggaggcogga acggactgtc ctttctgggg 
cggggtgggg agggggtgtc gctggagggo ccggtggcat agcaacggac gagagaggcc 
tggaggaggg gcggggaggg ggagttgtgt ggcagttcta agggaagggt gggtgctggg 
acgggtgtcc gggagggagg ggagcctggc ggggtctggg gcctcgtcgc ggagggcgct 
gcgaggggga aactggggaa agggcotaat tccccagtot ccacctogaa tcaggaaaga 
gaaggggcgg gctgotgggc aaaagaggtg aatggctgcg gggggotgga gaagagagat 
gggaggggoc ggccggcggg ggtgaggggg tctaaagatt gtgggggtga ggaactgagg 
gtggggggcg cccagaggcg ggactcgggg oggggoaggo gaggcggagg gcgagggctg 
cgggagcaag taoggagccg ggggtgtggg ggacgattgc cgctgcagoc gccgccccao 
tcacctcegg tgtgtctgca gcccggacac taagggagat ggatgaatgg gtggggagga 
tgoggogcac atggocccgg gcggctcggc ggtcagotgc cgoocccaca gcggaccggt 
cggggcgggg gtcgggcggt agaaaaaagg gccgogaggc gagcggggca ctgggcggao 
cgcggoggca goatgagcgg cgcagaccgt agccccaatg cgggogcago ccotgactcg 
gccoogggcc aggcggcggt ggcttcggcc taccagcgct tcgagccgcg cgcctaccto 
cgcaacaact acgogccocc tcgcggggac ctgtgcaacc cgaacggcgt cgggccgtgg 
aagotgcgct gcttggcgca gaccttcgcc accggtgagc gggggaaact gaggoacgag 
ggacaagagg tcgtcgggga gtgaaagcag gcgcagggaa ataaaaagaa ggaaagggag 
acagaocagg cgcctaaoag atggggacca agaaacaaga gatagctgag aggtgcaaac 
agaagagaaa aaggagoaac atcccttagg agaggggoag aggagagaga ggtggagaga 
gggggcggag agtgctcaga attgagagct aaggtggggg atgcaggaca gactgaggtg 
gagatgcata ggaggaaatg gaggcagatg tgggacaggg gtgagaaact ccaggatttc 
ctcgctgagc ctggotggta ggtatagttg ttttctttct ttttctttat tttattttca 
tttatttact tatttttatt ttttatttgt tttgagacgg agtttcgctc ttgttgccca 
ggctggagta caatggcgcc atctcggoto actgcaaoct ccgcctcccc gggttcaago 2400 
gattctottg cotcagottc cctagtagct gggattaoag gcatgogoco coatgcctgg
ctaatttatt tgtattttta gtagagacgg gacttctcoa tgttggtcag gctggtctog 
aactcccaac cttaggatoo acccaccccg gcctcccaaa gtgctgggat tacaggtgtg 
agocactgcg oocggccagt aggtatagtc ttctagatgt gaaaoetgag totcagagog 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 



2460 
2520 
2580 
2640 






gtgaagttco cttccgaagg gcagcccatg ttggagctgg gttcagtcta actctggggc 
oaatgotttt tocagatgga gacacatttg cagaggagaa ggaagaacta gagagaggca 
gggagatgca ggggagggaa gggtaaggag gcaggggotg cctgggctgg ctggoaccag 
gaccctcttc ototgocotg occaggtgaa gtgtccggac gcaccctcat cgacattggt 
tcaggcccoa ocgtgtacca gctgctcagt gcctgcagcc actttgagga catcaccatg 
acagatttcc tggaggtoaa ccgccaggag otggggcget ggotgcagga ggagocgggg 
gcottcaact ggagcatgta cagccaacat gcctgcotca ttgagggcaa ggggtaagga 
otggggggtg agggttgggg aggaggcttc ccatagagtg gctggttggg gcaacagagg 
cctgagogta gaacagcctt gagccctgce ttgtgcctcc tgcacaggga atgctggcag 
gataaggagc gccagctgcg agcoagggtg aaacgggtcc tgcccatcga cgtgcaccag 
occcageccc tgggtgctgg gagcccagot cccctgcctg ctgacgccct ggtctctgoc 
ttctgcttgg aggctgtgag cccagatctt gccagctttc agogggccct ggaccacatc 
accacgctgo tgaggootgg ggggcaccta ctoctcatcg gggccctgga ggagtcgtgg 
taoctggotg gggaggccag gctgacggtg gtgccagtgt ctgaggagga ggtgagggag 
gocctggtgo gtagtggcta caaggtccgg gacctocgca cctatatcat goctgcccac 
cttcagacag gcgtagatga tgtcaagggc gtcttcttcg cctgggctca gaaggttggg 
otgtgagggc tgtaoctggt gccctgtggc ooocacecac ctggattccc tgttctttga 
agtggcacct aataaagaaa taataccetg ccgctgoggt oagtgctgtg tgtggctctc 
ctgggaagca goaagggccc agagatctga gtgtccgggt aggggagaca ttcaecctag 
gotttttttc cagaagctt 



<210> 10 

<211> 4530 

<212> DNA 

<213> Homo sapiens 



<400> 10 

aattctcgag ctcgtcgacc ggtcgacgag ctcgagggte gacgagctcg agggcgcgcg 
ccoggccccc aoccotcgca gcaccccgcg ccccgcgccc tcccagccgg gtccagcogg 
agccatgggg ccggagccgc agtgagcacc atggagctgg cggccttgtg ccgctggggg 
ctcctcctcg ccctcttgcc ccccggagac gcgagcaccc aagtgtgcac cggcacagac 
atgaagctgo ggctccctgc oagtcccgag acccacctgg acatgctccg coacctctao 
cagggctgcc aggtggtgca gggaaacctg gaactcacct acctgcccac caatgccagc 
ctgtccttcc tgoaggatat ooaggaggtg cagggctacg tgctcatcgc toacaaccaa 
gtgaggcagg tcccactgca gaggctgcgg attgtgcgag gcacccagct ctttgaggac 
aactatgccc tggcogtgct agacaatgga gacccgctga aoaatacoac ccctgtcaca 
ggggcotccc caggaggcct gogggagctg cagcttcgaa gcctcacaga gatcttgaaa 
ggaggggtct tgatooagcg gaacccccag ctctgctacc aggacacgat tttgtggaag 
gacatcttcc acaagaacaa ccagctggct ctcacaotga tagacaccaa ccgctctcgg 
gcctgccacc cctgttctcc gatgtgtaag ggctcccgct gctggggaga gagttctgag 
gattgtcaga gcotgaogcg oactgtctgt gccggtggct gtgcccgotg caaggggcca 
ctgoocactg actgctgcca tgagcagtgt gctgccggot gcacgggccc caagcactct 
gactgcotgg cctgcctcca cttcaaccac agtggcatct gtgagotgca ctgoccagcc 
ctggtcacct aoaacacaga cacgtttgag tccatgccca atcccgaggg ccggtataoa 




2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 
3180 
3240 
3300 
3360 
3420 
3480 
3540 
3600 
3660 
3720 
3780 
3799 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 



ttcggogcca gotgtgtgac tgcctgtccc tacaactaco tttotaogga cgtgggatcc 1080 

tgcacootcg tctgccccct gcacaacoaa gaggtgacag cagaggatgg aacacagcgg 1140 

tgtgagaagt goagoaagoc ctgtgcccga gtgtgctatg gtctgggcat ggagcacttg 1200 

cgagaggtga gggcagttac cagtgccaat atccaggagt ttgotggctg oaagaagatc 1260 

tttgggagcc tggcatttct gccggagagc tttgatgggg acooagcctc caacactgcc 1320 

ccgctccagc cagagcagot ocaagtgttt gagactctgg aagagatcac aggttaccta 1380 

tacatotcag catggccgga oagcctgcct gacctcagcg tottccagaa cctgcaagta 1440 

atccggggac gaattctgca caatggcgcc tactcgotga ccctgcaagg gctgggcatc 

agctggctgg ggctgcgcto actgagggaa ctgggcagtg gactggocct catccaccat 
aaoacccacc tctgcttcgt gcaoacggtg ccctgggacc agotctttcg gaacccgcac 



1500 
1560 
1620 






caagctotgc tccacactgo caaccggcca gaggaogagt gtgtgggcga gggcctggcc 1680 

tgccaccagc tgtgcgcccg agggcactgc tggggtccag ggoocaccca gtgtgtcaac 1740 

tgcagccagt tccttcgggg ccaggagtgc gtggaggaat gccgagtact gcaggggcto 1800 

cccagggagt atgtgaatgc caggcaetgt ttgcegtgcc accctgagtg toagccccag 1860

aatggctcag tgaeotgttt tggaccggag gctgaccagt gtgtggcctg tgccoactat 1920 

aaggaccctc ccttctgcgt ggcccgctgc cccagcggtg tgaaaoctga cctctcotac 1980 

atgcocatot ggaagtttcc agatgaggag ggcgcatgcc agcottgccc oatcaactgc 2040 

acocactcot gtgtggacct ggatgacaag ggctgooccg ccgagcagag agccagccct 2100 

ctgacgtcca tcgtctctge ggtggttggc attctgotgg tcgtggtctt gggggtggtc 2160 

tttgggatoc toatcaagcg aoggcagoag aagatocgga agtaoacgat gcggagactg 2220

ctgcaggaaa cggagotggt ggagccgctg acacotagcg gagcgatgcc caaccaggcg 2280 

cagatgcgga tcctgaaaga gacggagctg aggaaggtga aggtgcttgg atotggogot 2340 

tttggcacag tctacaaggg catetggato octgatgggg agaatgtgaa aattccagtg 2400 

gccatcaaag tgttgaggga aaacacatcc cccaaagcca acaaagaaat cttagacgaa 2460 

gcatacgtga tggctggtgt gggctcccca tatgtetcoc gccttctggg oatctgcotg 2520

acatccacgg tgcagctggt gacacagott atgooetatg gctgcctctt agaccatgtc 2580 

cgggaaaacc gcggacgcct gggctcccag gacctgctga actggtgtat gcagattgcc 2640 

aaggggatga gctaactgga ggatgtgcgg ctcgtacaoa gggacttggc cgoteggaac 2700 

gtgotggtca agagtceeaa ocatgtcaaa attacagact tcgggctggc tcggctgctg 2760 

gacattgacg agacagagta ccatgcagat gggggcaagg tgcccatcaa gtggatggcg 2820 

ctggagtcca ttctccgceg gcggtteaec oaccagagtg atgtgtggag ttatggtgtg 2880

actgtgtggg agctgatgac ttttggggcc aaaccttacg atgggatcoc agcccgggag 2940 

atccctgacc tgotggaaaa gggggagcgg atgocccagc cccccatctg caccattgat 3000 

gtctacatga tcatggtcaa atgttggatg attgactctg aatgtcggec aagattccgg 3060 

gagttggtgt ctgaattcte ocgoatggcc agggaccccc agcgctttgt ggtcatccag 3120 

25 aatgaggact tgggcccagc cagtcccttg gacagcaoet tctaccgctc actgctggag 3180 

gacgatgaca tgggggacct ggtggatgct gaggagtato tggtaoccca goagggottc 3240 

ttetgtceag accctgcccc gggcgctggg ggcatggtcc accacaggca ccgcagctca 3300 

tctaccagga gtggcggtgg ggacctgaca ctagggctgg agccctctga agaggaggcc 3360 

cccaggtctc cactggcacc ctccgaaggg gctggctccg atgtatttga tggtgacctg 3420 

ggaatggggg cagccaaggg gctgcaaagc ctccccacac atgaccccag ccctctacag 3480 

30 cggtacagtg aggaccccac agtacccctg ccctctgaga ctgatggcta cgttgccccc 3540 

etgacctgoa gccoocagoc tgaatatgtg aaccagccag atgttcggcc ccageoooot 3600 

tcgcoccgag agggccctct gcctgotgcc cgacctgotg gtgccactct ggaaagggcc 3660 

aagactctct ccccagggaa gaatggggtc gtcaaagacg tttttgcctt tgggggtgcc 3720 

gtggagaacc ccgagtactt gacacoooag ggaggagctg cocctcagcc ccaocotoot 3780 

35 octgoottca gcccagcctt cgacaaootc tattactggg accaggacco accagagcgg 3840 

ggggctccac ccagcacctt caaagggaca cctacggcag agaacccaga gtacctgggt 3900 

ctggacgtgc cagtgtgaac cagaaggcca agtocgoaga agccctgatg tgtcctcagg 3960 

gagcagggaa ggcctgactt ctgctggcat caagaggtgg gagggccctc cgacoacttc 4020 

caggggaacc tgccatgcca ggaacctgtc ctaaggaacc ttocttcctg cttgagttcc 4080 

40 cagatggctg gaaggggtcc agoctcgttg gaagaggaac agcactgggg agtctttgtg 4140 

gattctgagg ccctgcccaa tgagactcta gggtccagtg gatgccacag cccagcttgg 4200 

ccotttcctt ccagatcctg ggtactgaaa gccttaggga agctggcctg agaggggaag 4260 

oggocctaag ggagtgtcta agaacaaaag cgaoocattc agagactgtc cctgaaacct 4320 

agtaotgccc cccatgagga aggaacagca atggtgtcag tatccaggct ttgtacagag 4380 

tgcttttctg tttagttttt actttttttg ttttgttttt ttaaagaoga aataaagaec 4440 

45 oaggggagaa tgggtgttgt atggggaggc aagtgtgggg ggtocttotc oacacccact 4500 

ttgtccattt geaaatatat tttggaaaac 4530 

<210> 11 

<211> 2205

<212> DNA 



<213> Homo sapiens 




<400> 11 

cacagggctc ccccccgcct ctgacttctc tgtccgaagt cgggacaccc tcctaccacc 60
tgtagagaag cgggagtgga tctgaaataa aatccaggaa tctgggggtt cctagacgga 120
gccagacttc ggaacgggtg tcctgctact cctgctgggg ctcctccagg acaagggcac 180
acaactggtt ccgttaagcc cctctctcgc tcagacgcca tggagctgga tctgtctcca 240
cctcatctta gcagctctcc ggaagacctt tggccagccc ctgggacccc tcctgggact 300
ccccggcccc ctgatacccc tctgcctgag gaggtaaaga ggtcccagcc tctcctcatc 360
ccaaccaccg gcaggaaact tcgagaggag gagaggcgtg ccacctccct cccctctatc 420
cccaacccct tccctgagct ctgcagtcct ccctcacaga gcccaattct cgggggcccc 480
tccagtgcaa gggggctgct cccccgcgat gccagccgcc cccatgtagt aaaggtgtac 540
agtgaggatg gggcctgcag gtctgtggag gtggcagcag gtgccacagc tcgccacgtg 600
tgtgaaatgc tggtgcagcg agctcacgcc ttgagcgacg agacctgggg gctggtggag 660
tgccaccccc acctagcact ggagcggggt ttggaggacc acgagtccgt ggtggaagtg 720
caggctgcct ggcccgtggg cggagatagc cgcttcgtct tccggaaaaa cttcgccaag 780
tacgaactgt tcaagagctc cccacactcc ctgttcccag aaaaaatggt ctccagctgt 840
ctcgatgcac acactggtat atcccatgaa gacctcatcc agaacttcct gaatgctggc 900
agctttcctg agatccaggg ctttctgcag ctgcggggtt caggacggaa gctttggaaa 960
cgctttttct gtttcttgcg ccgatctggc atatattact ccaccaaggg cacctctaag 1020
gatccgaggc acctgcagta cgtggcagat gtgaacgagt ccaacgtgta cgtggtgacg 1080
cagggccgca agctctacgg gatgcccact gacttcggtt tctgtgtcaa gcccaacaag 1140
cttcgaaatg gacacaaggg gcttcggatc ttctgcagtg aagatgagca gagccgcacc 1200
tgctggctgg ctgccttccg cctcttcaag tacggggtgc agctgtacaa gaattaccag 1260
caggcacagt ctcgccatct gcatccatct tgtttgggct ccccaccctt gagaagtgcc 1320
tcagataata ccctggtggc catggacttc tctggccatg ctgggcgtgt cattgagaac 1380
ccccgggagg ctctgagtgt ggccctggag gaggcccagg cctggaggaa gaagacaaac 1440
caccgcctca gcctgcccat gccagcctcc ggcacgagcc tcagtgcagc catccaccgc 1500
acccaactct ggttccacgg gcgcatttcc cgtgaggaga gccagcggct tattggacag 1560
cagggcttgg tagacggcct gttcctggtc cgggagagtc agcggaaccc ccagggcttt 1620
gtcctctctt tgtgccacct gcagaaagtg aagcattatc tcatcctgcc gagcgaggag 1680
gagggtcgcc tgtacttcag catggatgat ggccagaccc gcttcactga cctgctgcag 1740
ctcgtggagt tccaccagct gaaccgcggc atcctgccgt gcttgctgcg ccattgctgc 1800
acgcgggtgg ccctctgacc aggccgtgga ctggctcatg cctcagcccg ccttcaggct 1860
gcccgccgcc cctccaccca tccagtggac tctggggcgc ggccacaggg gacgggatga 1920
ggagcgggag ggttccgcca ctccagtttt ctcctctgct tctttgcctc cctcagatag 1980
aaaacagccc ccactccagt ccactcctga cccctctcct caagggaagg ccttgggtgg 2040
ccccctctcc ttctcctagc tctggaggtg ctgctctagg gcagggaatt atgggagaag 2100
tgggggcagc ccaggcggtt tcacgcccca cactttgtac agaccgagag gccagttgat 2160
ctgctctgtt ttatactagt gacaataaag attatttttt gatac 2205

<210> 12 

<211> 2177 

<212> DNA 

<213> Homo sapiens 



<400> 12 

gaattcgcgg ccgctggttt gcagctgctc cgtcatcgtg cggcccgacg ctatctcgcg 60
ctcgtgtgca ggcccggctc ggctcctggt ccccggtgcg agggttaacg cgaggccccg 120
gcctcggtcc ccggactagg ccgtgacccc gggtgccatg aagcaggagg gctcggcgcg 180
gcgccgcggc gcggacaagg cgaaaccgcc gcccggcgga ggagaacaag aacccccacc 240
gccgccggcc ccccaggatg tggagatgaa agaggaggca gcgacgggtg gcgggtcaac 300
gggggaggca gacggcaaga cggcggcggc agcggttgag cactcccagc gagagctgga 360
cacagtcacc ttggaggaca tcaaggagca cgtgaaacag ctagagaaag cggtttcagg 420
caaggagccg agattcgtgc tgcgggccct gcggatgctg ccttccacat cacgccgcct 480
caaccactat gttctgtata aggctgtgca gggcttcttc acttcaaata atgccactcg 540
agactttttg ctccacttcc tggaagagcc catggacaca gaggctgatt tacagttccg 600
tccccgcacg ggaaaagctg cgtcgacacc cctcctgcct gaagtggaag cctatctcca 660
actcctcgtg gtcatcttca tgatgaacag caagcgctac aaagaggcac agaagatctc 720
tgatgatctg atgcagaaga tcagtactca gaaccgccgg gccctagacc ttgtagccgc 780
aaagtgttac tattatcacg cccgggtcta tgagttcctg gacaagctgg atgtggtgcg 840
cagcttcttg catgctcggc tccggacagc tacgcttcgg catgacgcag acgggcaggc 900
caccctgttg aacctcctgc tgcggaatta cctacactac agcttgtacg accaggctga 960
gaagctggtg tccaagtctg tgttcccaga gcaggccaac aacaatgagt gggccaggta 1020
cctctactac acagggcgaa tcaaagccat ccagctggag tactcagagg cccggagaac 1080
gatgaccaac gcccttcgca aggcccctca gcacacagct gtcggcttca aacagacggt 1140
gcacaagctt ctcatcgtgg tggagctgtt gctgggggag atccctgacc ggctgcagtt 1200
ccgccagccc tccctcaagc gctcactcat gccctatttc cttctgactc aagctgtcag 1260
gacaggaaac ctagccaagt tcaaccaggt cctggatcag tttggggaga agtttcaagc 1320
agatgggacc tacaccctaa ttatccggct gcggcacaac gtgattaaga caggtgtacg 1380
catgatcagc ctctcctatt cccgaatctc cttggctgac atcgcccaga agctgcagtt 1440
ggatagcccc gaagatgcag agttcattgt tgccaaggcc atccgggatg gtgtcattga 1500
ggccagcatc aaccaagaga agggctatgt ccaatccaag gagatgattg acatctattc 1560
cacccgagag ccccagctag ccttccacca gcgcatctcc ttctgcctag atatccacaa 1620
catgtctgtc aaggccatga ggtttcctcc caaatcgtac aacaaggact tggagtctgc 1680
agaggaacgg cgtgagcgag aacagcagga cttggagttt gccaaggaga tggcagaaga 1740
tgatgatgac agcttccctt gagctggggg gctggggagg ggtaggggga atggggacag 1800
gGtctttccc ccttgggggt cccctgccca gggcactgtc cccattttcc cacacacagc 1860
tcatatgctg cattcgtgca gggggtgggg gtgctgggag ccagccaccc tgacctcccc 1920
cagggctcct ccccagccgg tgacttactg tacagcaggc aggagggtgg gcaggcaacc 1980
tccccgggca gggtcctggc cagcagtgtg ggagcaggag gggaaggata gttctgtgta 2040
ctcctttagg gagtggggga ctagaactgg gatgtcttgg cttgtatgtt ttttgaagct 2100
tcgattatga tttttaaaca ataaaaagtt ct< 2160
aaagcggccg cgaattc 2177

<210> 13 

<211> 2960 

<212> DNA 

<213> Homo sapiens 



<400> 13 

ctgccgcttc caggcgtcta tcagcggctc agcctttgtt cagctgttct gttcaaacac 60
tctggggcca ttcaggcctg ggtggggcag cgggaggaag ggagtttgag gggggcaagg 120
cgacgtcaaa ggaggatcag agattccaca atttcacaaa actttcgcaa acagcttttt 180
gttccaaccc ccctgcattg tcttggacac caaatttgca taaatcctgg gaagttatta 240
ctaagcctta gtcgtggccc caggtaattt cctcccaggc ctccatgggg ttatgtataa 300
agggccccct agagctgggc cccaaaacag cccggagcct gcagcccagc cccacccaga 360
cccatggctg gacctgccac ccagagcccc atgaagctga tgggtgagtg tcttggccca 420
ggatgggaga gccgcctgcc ctggcatggg agggaggctg gtgtgacaga ggggctgggg 480
atccccgttc tgggaatggg gattaaaggc acccagtgtc cccgagaggg cctaaggtgg 540
tagggaacag catgtctcct gagcccgctc tgtccccagc cctgcagctg ctgctgtggc 600
acagtgcact ctggacagtg caggaagcca cccccctggg ccctgccagc tccctgcccc 660
agagcttcct gctcaagtgc ttagagcaag tgaggaagat ccagggcgat ggcgcagcgc 720
tccaggagaa gctggtgagt gaggtgggtg agagggctgt ggagggaagc ccggtgggga 780
gagctaaggg ggatggaact gcagggccaa catcctctgg aagggacatg ggagaatatt 840
aggagcagtg gagctgggga aggctgggaa gggacttggg gaggaggacc ttggtgggga 900
cagtgctcgg gagggctggc tgggatggga gtggaggcat cacattcagg agaaagggca 960
agggcccctg tgagatcaga gagtgggggt gcagggcaga gaggaactga acagcctggc 1020
aggacatgga gggaggggaa agaccagaga gtcggggagg acccgggaag gagcggcgac 1080
ccggccacgg cgagtctcac tcagcatcct tccatcccca gtgtgccacc tacaagctgt 1140
gccaccccga ggagctggtg ctgctcggac actctctggg catcccctgg gctcccctga 1200
gcagctgccc cagccaggcc ctgcagctgg tgagtgtcag gaaaggataa ggctaatgag 1260
gagggggaag gagaggagga acacccatgg gctcccccat gtctccaggt tccaagctgg 1320
gggcctgacg tatctcaggc agcaccccct aactcttccg ctctgtctca caggcaggct 1380
gcttgagcca actccatagc ggccttttcc tctaccaggg gctcctgcag gccctggaag 1440
ggatctcccc cgagttgggt cccaccttgg acacactgca gctggacgtc gccgactttg 1500
ccaccaccat ctggcagcag gtgagccttg ttgggcaggg tggccaaggt cgtgctggca 1560
ttctgggcac cacagccggg cctgtgtatg ggccctgtcc atgctgtcag cccccagcat 1620
ttcctcattt gtaataacgc ccactcagaa gggcccaacc actgatcaca gctttccccc 1680
acagatggaa gaactgggaa tggcccctgc cctgcagccc acccagggtg ccatgccggc 1740
cttcgcctct gctttccagc gccgggcagg aggggtcctg gttgcctccc atctgcagag 1800
cttcctggag gtgtcgtacc gcgttctacg ccaccttgcc cagccctgag ccaagccctc 1860
cccatcccat gtatttatct ctatttaata tttatgtcta tttaagcctc atatttaaag 1920
acagggaaga gcagaacgga gccccaggcc tctgtgtcct tccctgcatt tctgagtttc 1980
attctcctgc ctgtagcagt gagaaaaagc tcctgtcctc ccatcccctg gactgggagg 2040
tagataggta aataccaagt atttattact atgactgctc cccagccctg gctctgcaat 2100
gggcactggg atgagccgct gtgagcccct ggtcctgagg gtccccacct gggacccttg 2160
agagtatcag gtctcccacg tgggagacaa gaaatccctg tttaatattt aaacagcagt 2220
gttccccatc tgggtccttg cacccctcac tctggcctca gccgactgca cagcggcccc 2280
tgcatcccct tggctgtgag gcccctggac aagcagaggt ggccagagct gggaggcatg 2340
gccctggggt cccacgaatt tgctggggaa tctcgttttt cttcttaaga cttttgggac 2400
atggtttgac tcccgaacat caccgacgtg tctcctgttt ttctgggtgg cctcgggaca 2460
catgccctgc ccccacgagg gtcaggactg tgactctttt tagggccagg caggtgcctg 2520
gacatttgcc ttgctggatg gggactgggg atgtgggagg gagcagacag gaggaatcat 2580
gtcaggcctg tgtgtgaaag gaagctccac tgtcaccctc cacctcttca ccccccactc 2640
accagtgtcc cctccactgt cacattgtaa ctgaacttca ggataataaa gtgtttgcct 2700
ccagtcacgt ccttcctcct tcttgagtcc agctggtgcc tggccagggg ctggggaggt 2760
ggctgaaggg tgggagaggc cagagggagg tcggggagga ggtctgggga ggaggtccag 2820
ggaggaggag gaaagttctc aagttcgtct gacattcatt ccgttagcac atatttatct 2880
gagcacctac tctgtgcaga cgctgggcta agtgctgggg acacagcagg gaacaaggca 2940
gacatggaat atgcactcga 2960



<210> 14



<211> 850 



<212> DNA 

<213> Homo sapiens 



<220> 

<221> misc_feature 

<222> (3) . . (4) 

<223> n=a, c, g or t 



<220>

<221> misc_feature

<222> (9) . . (9) 

<223> n=a, c, g or t

<220> 

<221> misc_feature 

<222> (11) . . (11)

<223> n=a, c, g or t

<220> 

<221> misc_feature 

<222> (18) . . (18) 

<223> n=a, c, g or t 

<220> 

<221> misc_feature 

<222> (202) . . (202) 

<223> n=a, c, g or t 

<220> 

<221> misc_feature 
<222> (205) . . (205) 

<223> n=a, c, g or t

<220> 

<221> misc_feature 
<222> (273) . . (273) 

<223> n=a, c, g or t 



<220>

<221> misc_feature 

<222> (327) . . (327) 

<223> n=a, c, g or t 

<220> 

<221> misc_feature 

<222> (367) . . (367) 

<223> n=a, c, g or t 

<220> 

<221> misc_feature 

<222> (581) . . (581) 

<223> n=a, c, g or t 

<220> 

<221> misc_feature

<222> (599) . . (599) 

<223> n=a, c, g or t 

<220> 

<221> misc_feature 

<222> (628) . . (628) 

<223> n=a, c, g or t 

<220> 

<221> misc_feature

<222> (673) . . (673) 

<223> n=a, c, g or t 
<220>

<221> misc_feature 

<222> (675) . . (675) 

<223> n=a, c, g or t 

<220> 

<221> misc_feature 

<222> (682) . . (682) 

<223> n=a, c, g or t 

<220> 

<221> misc_feature 

<222> (693) . . (693) 

<223> n=a, c, g or t 

<220> 

<221> misc_feature

<222> (698) . . (698) 

<223> n=a, c, g or t 

<220> 

<221> misc_feature 

<222> (700) . . (700) 

<223> n=a, c, g or t 

<220> 

<221> misc_feature 

<222> (720) . . (720) 

<223> n=a, c, g or t 



<220>

<221> misc_feature 

<222> (730) . . (730) 

<223> n=a, c, g or t 

<220> 

<221> misc_feature 

<222> (734) . . (734) 

<223> n=a, c, g or t 

<220> 

<221> misc_feature

<222> (742) . . (743) 

<223> n=a, c, g or t 

<220> 

<221> misc_feature 

<222> (746) . . (746) 

<223> n=a, c, g or t 

<220> 

<221> misc_feature

<222> (748) . . (748) 

<223> n=a, c, g or t 

<220> 

<221> misc_feature 

<222> (752) . . (752) 

<223> n=a, c, g or t 



<220>

<221> misc_feature 

<222> (762) . . (762) 

<223> n=a, c, g or t

<220> 

<221> misc_feature 

<222> (767) . . (767) 

<223> n=a, c, g or t 

<220> 

<221> misc_feature

<222> (777) . . (777) 

<223> n=a, c, g or t 

<220> 

<221> misc_feature

<222> (783) . . (784) 

<223> n=a, c, g or t 

<220> 

<221> misc_feature 

<222> (789) . . (789) 

<223> n=a, c, g or t 

<220> 

<221> misc_feature 

<222> (794) . . (794) 

<223> n=a, c, g or t 
<220>

<221> misc_feature

<222> (797) . . (798)

<223> n=a, c, g or t

<220>

<221> misc_feature

<222> (803) . . (805)

<223> n=a, c, g or t

<220>

<221> misc_feature

<222> (810) . . (810)

<223> n=a, c, g or t

<220>

<221> misc_feature

<222> (817) . . (817)

<223> n=a, c, g or t

<220>

<221> misc_feature

<222> (826) . . (827)

<223> n=a, c, g or t

<220>

<221> misc_feature

<222> (831) . . (832)

<223> n=a, c, g or t

<220>

<221> misc_feature 

<222> (834).. (834) 

<223> n=a, c, g or t 

<220> 

<221> misc_feature 

<222> (837).. (838) 

<223> n=a, c, g or t 

<220> 

<221> misc_feature

<222> (840).. (840) 

<223> n=a, c, g or t 

<220> 

<221> misc_feature 

<222> (844) . . (844) 

<223> n=a, c, g or t

<220> 

<221> misc_feature 

<222> (846).. (848) 

<223> n=a, c, g or t 

<400> 14 



actggtgaag gtgtcagoca tgtctagccc canggtggtt 



ttccacacgg 


acaacatgcg 


60 


atctcctcca 


tcctggggtc 


120 


atggaggagt 


gtgtggactg 


180 


cccttcacca 


ccgtgtcgga 


240 


ctggccatca 


cggacctcag 


300 


gcactctgag 


gggcttggca 


360 


cctccaaggg 


aagcagtgag 


420 


goaagagcot 


otagcggctt 


480 


cggtttcttc 


ttttcaaaat 


540 


nctggaccac 


aagcccagng 


600 


ttacctcttg 


aggaactttc 


660 


gtgtttttcg 


gggttttttn 


720 


cnttagagct 


ccccggngga 


780 
aanntcttna tccnctnnct ttnnnctccn tcacctncct tctttnntct nntnttnncn 840
tccncnnncc 850

<210> 15

<211> 2309 

<212> DNA

<213> Homo sapiens 



<400> 15

ccccgggcgc aggaggcggg cggcccggcc ccaccggccc cccatggacg cccccagcac 60
ggggcgctga gacccccgcg tcgctgccca gcccggtccg gcgcgccacg ccagggatct 120
ctggacagga caagactccg aagctactcc cccagcacac agcccgggac ccacaaaccc 180
agcttgcccc cagccctccc acctgccact ccctggcccc tcccaccgcc cgcccccctt 240
ggggcgcagg gcatggtgtg aaaggccaag tgctgaggcg ggtatcatgg gtgctgtgcc 300
ctagggcctg ggtggcaggg ggtgggtggc ctgtgggtgt gccggggggg ccagtgtgcc 360
caccccagtc tcttggcgtg ctggagggca tcctggatgg aattgaagtg aatggaacag 420
aagccaagca aggtggagtg tgggtcagac ccagaggaga acagtgccag gtcaccagat 480
ggaaagcgaa aaagaaagaa cggccaatgt tccctgaaaa ccagcatgtc agggtatatc 540
cctagttacc tggacaaaga cgagcagtgt gtcgtgtgtg gggacaaggc aactggttat 600
cactaccgct gtatcacttg tgagggctgc aagggcttct ttcgccgcac aatccagaag 660
aacctccatc ccacctattc ctgcaaatat gacagctgct gtgtcattga caagatcacc 720
cgcaatcagt gccagctgtg ccgcttcaag aagtgcatcg ccgtgggcat ggccatggac 780
ttggttctag atgactcgaa gcgggtggcc aagcgtaagc tgattgagca gaaccgggag 840
cggcggcgga aggaggagat gatccgatca ctgcagcagc gaccagagcc cactcctgaa 900
gagtgggatc tgatccacat tgccacagag gcccatcgca gcaccaatgc ccagggcagc 960
cattggaaac agaggcggaa attcctgccc gatgacattg gacagtcacc cattgtctcc 1020
atgccggacg gagacaaggt ggacctggaa gccttcagcg agtttaccaa gatcatcacc 1080
ccggccatca cccgtgtggt ggactttgcc aaaaaactgc ccatgttctc cgagctgcct 1140
tgcgaagacc agatcatcct cctgaagggg tgctgcatgg agatcatgtc cctgcgggcg 1200
gctgtccgct acgaccctga gagcgacacc ctgacgctga gtggggagat ggctgtcaag 1260
cgggagcagc tcaagaatgg cggcctgggc gtagtctccg acgccatctt tgaactgggc 1320
aagtcactct ctgcctttaa cctggatgac acggaagtgg ctctgctgca ggctgtgctg 1380
ctaatgtcaa cagaccgctc gggcctgctg tgtgtggaca agatcgagaa gagtcaggag 1440
gcgtacctgc tggcgttcga gcactacgtc aaccaccgca aacacaacat tccgcacttc 1500
tggcccaagc tgctgatgaa ggagagagaa gtgcagagtt cgattctgta caagggggca 1560
gcggcagaag gccggccggg cgggtcactg ggcgtccacc cggaaggaca gcagcttctc 1620
ggaatgcatg ttgttcaggg tccgcaggtc cggcagcttg agcagcagct tggtgaagcg 1680
ggaagtctcc aagggccggt tcttcagcac cagagcccga agagcccgca gcagcgtctc 1740
ctggagctgc tccaccgaag cggaattctc catgcccgag cggtctgtgg ggaagacgac 1800
agcagtgagg cggactcccc gagctcctct gaggaggaac cggaggtctg cgaggacctg 1860
gcaggcaatg cagcctctcc ctgaagcccc ccagaaggcc gatggggaag gagaaggagt 1920
gccatacctt ctcccaggcc tctgccccaa gagcaggagg tgcctgaaag ctgggagcgt 1980
gggctcagca gggctggtca cctcccatcc cgtaagacca ccttcccttc ctcagcaggc 2040
caaacatggc cagactccct tgctttttgc tgtgtagttc cctctgcctg ggatgccctt 2100
ccccctttct ctgcctggca acatcttact tgtcctttga ggccccaact caagtgtcac 2160
ctccttcccc agctccccca ggcagaaata gttgtctgtg cttccttggt tcatgcttct 2220
actgtgacac ttatctcact gttttataat tagtcgggca tgagtctgtt tcccaagcta 2280
gactgtgtct gaatcatgtc tgtatcccg 2309



<210> 16 

<211> 2355 

<212> DNA 

<213> Homo sapiens 



<400> 16 

ccgttgcctc aacgtccaac ccttctgcag ggctgcagtc cggccacccc aagaccttgc 60
tgcagggtgc ttcggatcct gatcgtgagt cgcggggtcc actccccgcc cttagccagt 120
gcccaggggg caacagcggc gatcgcaacc tctagtttga gtcaaggtcc agtttgaatg 180
accgctctca gctggtgaag acatgaccac cctggactcc aacaacaaca caggtggcgt 240
catcacctac attggctcca gtggctcctc cccaagccgc accagccctg aatccctcta 300
tagtgacaac tccaatggca gcttccagtc cctgacccaa ggctgtccca cctacttccc 360
accatccccc actggctccc tcacccaaga cccggctcgc tcctttggga gcattccacc 420
cagcctgagt gatgacggct ccccttcttc ctcatcttcc tcgtcgtcat cctcctcctc 480
cttctataat gggagccccc ctgggagtct acaagtggcc atggaggaca gcagccgagt 540
gtcccccagc aagagcacca gcaaaatcac caagctgaat ggcatggtgt tactgtgtaa 600
agtgtgtggg gacgttgcct cgggcttcca ctacggtgtg ctcgcctgcg agggctgcaa 660
gggctttttc cgtcggagca tccagcagaa catccagtac aaaaggtgtc tgaagaatga 720
gaattgctcc atcgtccgca tcaatcgcaa ccgctgccag caatgtcgct tcaagaagtg 780
tctctctgtg ggcatgtctc gagacgctgt gcgttttggg cgcatcccca aacgagagaa 840
gcagcggatg cttgctgaga tgcagagtgc catgaacctg gccaacaacc agttgagcag 900
ccagtgcccg ctggagactt cacccaccca gcaccccacc ccaggcccca tgggcccctc 960
gccaccccct gctccggtcc cctcacccct ggtgggcttc tcccagtttc cacaacagct 1020
gacgcctccc agatccccaa gccctgagcc cacagtggag gatgtgatat cccaggtggc 1080
ccgggcccat cgagagatct tcacctacgc ccatgacaag ctgggcagct cacctggcaa 1140
cttcaatgcc aaccatgcat caggtagccc tccagccacc accccacatc gctgggaaaa 1200
tcagggctgc ccacctgccc ccaatgacaa caacaccttg gctgcccagc gtcataacga 1260
ggccctaaat ggtctgcgcc aggctccctc ctcctaccct cccacctggc ctcctggccc 1320
tgcacaccac agctgccacc agtccaacag caacgggcac cgtctatgcc ccacccacgt 1380
gtatgcagcc ccagaaggca aggcacctgc caacagtccc cggcagggca actcaaagaa 1440
tgttctgctg gcatgtccta tgaacatgta cccgcatgga cgcagtgggc gaacggtgca 1500
ggagatctgg gaggatttct ccatgagctt cacgcccgct gtgcgggagg tggtagagtt 1560
tgccaaacac atcccgggct tccgtgacct ttctcagcat gaccaagtca ccctgcttaa 1620
ggctggcacc tttgaggtgc tgatggtgcg ctttgcttcg ttgttcaacg tgaaggacca 1680
gacagtgatg ttcctaagcc ggaccaccta cagcctgcag gagcttggtg ccatgggcat 1740
gggagacctg ctcagtgcca tgttcgactt cagcgagaag ctcaactccc tggcgcttac 1800
cgaggaggag ctgggcctct tcaccgcggt ggtgcttgtc tctgcagacc gctcgggcat 1860
ggagaattcc gcttcggtgg agcagctcca ggagacgctg ctgcgggctc ttcgggctct 1920
ggtgctgaag aaccggccct tggagacttc ccgcttcacc aagctgctgc tcaagctgcc 1980
ggacctgcgg accctgaaca acatgcattc cgagaagctg ctgtccttcc gggtggacgc 2040
ccagtgaccc gcccggccgg ccttctgccg ctgccccctt gtacagaatc gaactctgca 2100
cttctctctc ctttacgaga cgaaaaggaa aagcaaacca gaatcttatt tatattgtta 2160
taaaatattc caagatgagc ctctggcccc ctgagccttc ttgtaaatac ctgcctccct 2220
cccccatcac cgaacttccc ctcctcccct atttaaacca ctctgtctcc cccacaaccc 2280
tcccctggcc ctctgatttg ttctgttcct gtctcaaatc caatagttca cagctaaaaa 2340
2355



<210> 17

<211> 4119 

<212> DNA 

<213> Homo sapiens 



<400> 17 

gaattccgtt gctgtcgcac acacacacac acacacacac acaccccaac acacacacac 60
acaccccaac acacacacac acacacacac acacacacac acacacacac acacagcggg 120
atggccgagc gccgcacgcg tagcacgccg ggactagcta tccagcctcc cagcagcctc 180
tgcgacgggc gcggtgcgta agtacctcgc cggtggtggc cgttctccgt aagatggcgg 240
acaggcggcg gcagcgcgct tcgcaagaca ccgaggacga ggaatctggt gcttcgggct 300
ccgacagcgg cggctccccg ttgcggggag gcgggagctg cagcggtagc gccggaggcg 360
gcggcagcgg ctctctgcct tcacagcgcg gaggccgaac cggggccctt catctgcggc 420
gggtggagag cgggggcgcc aagagtgctg aggagtcgga gtgtgagagt gaagatggca 480
ttgaaggtga tgctgttctc tcggattatg aaagtgcaga agactcggaa ggtgaagaag 540
gtgaatacag tgaagaggaa aactccaaag tggagctgaa atcagaagct aatgatgctg 600
ttaattcttc aacaaaagaa gagaagggag aagaaaagcc tgacaccaaa agcactgtga 660
ctggagagag gcaaagtggg gacggacagg agagcacaga gcctgtggag aacaaagtgg 720
gtaaaaaggg ccctaagcat ttggatgatg atgaagatcg gaagaatcca gcatacatac 780
ctcggaaagg gctcttcttt gagcatgatc ttcgagggca aactcaggag gaggaagtca 840
gacccaaggg gcgtcagcga aagctatgga aggatgaggg tcgctgggag catgacaagt 900
tccgggaaga tgagcaggcc ccaaagtccc gacaggagct cattgctctt tatggttatg 960
acattcgctc agctcataat cctgatgaca tcaaacctcg aagaatccgg aaaccccgat 1020
atgggagtcc tccacaaaga gatccaaact ggaacggtga gcggctaaac aagtctcatc 1080
gccaccaggg tcttgggggc accctaccac caaggacatt tattaacagg aatgctgcag 1140
gtaccggccg tatgtctgca cccaggaatt attctcgatc tgggggcttc aaggaaggtc 1200
gtgctggttt taggcctgtg gaagctggtg ggcagcatgg tggccggtct ggtgagactg 1260
ttaagcatga gattagttac cggtcacggc gcctagagca gacttctgtg agggatccat 1320
ctccagaagc agatgctcca gtgcttggca gtcctgagaa ggaagaggca gcctcagagc 1380
caccagctgc tgctcctgat gctgcaccac caccccctga taggcccatt gagaagaaat 1440
cctattcccg ggcaagaaga actcgaacca aagttggaga tgcagtcaag cttgcagagg 1500
aggtgccccc tcctcctgaa ggactgattc cagcacctcc agtcccagaa accaccccaa 1560
ctccacctac taagactggg acctgggaag ctccggtgga ttctagtaca agtggacttg 1620
agcaagatgt ggcacaacta aatatagcag aacagaattg gagtccgggg cagccttctt 1680
tcctgcaacc acgggaactt cgaggtatgc ccaaccatat acacatggga gcaggacctc 1740
cacctcagtt taaccggatg gaagaaatgg gtgtccaggg tggtcgagcc aaacgctatt 1800
catcccagcg gcaaagacct gtgccagagc cccccgcccc tccagtgcat atcagtatca 1860
tggagggaca ttactatgat ccactgcagt tccagggacc aatctatacc catggtgaca 1920
gccctgcccc gctgcctcca cagggcatgc ttgtgcagcc aggaatgaac cttccccacc 1980
caggtttaca tccccaccag acaccagctc ctctgcccaa tccaggcctc tatcccccac 2040
cagtgtccat gtctccagga cagccaccac ctcagcagtt gcttgctcct acttactttt 2100
ctgctccagg cgtcatgaac tttggtaatc ccagttaccc ttatgctcca ggggcactgc 2160
ctcccccacc accgcctcat ctgtatccta atacacaggc cccatcacag gtatatggag 2220
gagtgaccta ctataacccc gcccagcagc aggtgcagcc aaagccctcc ccaccccgga 2280
ggactcccca gccagtcacc atcaagcccc ctccacctga ggttgtaagc aggggttcca 2340
gttaatacaa gtttctgaat attttaaatc ttaacatcat ataaaaagca gcagaggtga 2400
gaactcagaa gagaaataca gctggctatc tactaccaga agggcttcaa agatataggg 2460
tgtggctcct accagcaaac agctgaaaga ggaggacccc tgccttcctc tgaggacagg 2520
ctctagagag agggagaaac aagtggacct cgtcccatct tcactcttca cttgagttgg 2580
ctgtgttcgg gggagcagag agagccagac agccccaagc ttctgagtct agatacagaa 2640
gcccatgtct tctgctgttc ttcacttctg ggaaattgaa gtgtcttctg ttcccaagga 2700
agctccttcc tgtttgtttt gttttctaag atgttcattt ttaaagcctg gcttcttatc 2760
cttaatatta ttttaatttt ttctctttgt ttctgtttct tgctctctct ccctgccttt 2820
aaatgaaaca agtctagtct tctggttttc tagcccctct ggattccctt ttgactcttc 2880
cgtgcatccc agataatgga gaatgtatca gccagccttc cccaccaagt ctaaaaagac 2940
ctggcctttc acttttagtt ggcatttgtt atcctcttgt atacttgtat tcccttaact 3000
ctaaccctgt ggaagcatgg ctgtctgcac agagggtccc attgtgcaga aaagctcaga 3060
gtaggtgggt aggagccctt ctctttgact taggttttta ggagtctgag catccatcaa 3120
tacctgtact atgatgggct tctgttctct gctgagggcc aataccctac tgtggggaga 3180
gatggcacac cagatgcttt tgtgagaaag ggatggtgga gtgagagcct ttgcctttag 3240
gggtgtgtat tcacatagtc ctcagggctc agtcttttga ggtaagtgga attagagggc 3300
cttgcttctc ttctttccat tcttcttgct acaccccttt tccagttgct gtggaccaat 3360
gcatctcttt aaaggcaaat attatccagc aagcagtcta ccctgtcctt tgcaattgct 3420
cttctccacg tctttcctgc tacaagtgtt ttagatgtta ctaccttatt ttccccgaat 3480
tctatttttg tccttgcaga cagaatataa aaactcctgg gcttaaggcc taaggaagcc 3540
agtcaccttc tgggcaaggg ctcctatctt tcctccctat ccatggcact aaaccacttc 3600
tctgctgcct ctgtggaaga gattcctatt actgcagtac atacgtctgc caggggtaac 3660
ctggccactg tccctgtcct tctacagaac ctgagggcaa agatggtggc tgtgtctctc 3720
cccggtaatg tcactgtttt tattccttcc atctagcagc tggcctaatc actctgagtc 3780
acaggtgtgg gatggagagt ggggagaggc acttaatctg taacccccaa ggaggaaata 3840
actaagagat tcttctaggg gtagctggtg gttgtgcctt ttgtaggctg ttccctttgc 3900
cttaaacctg aagatgtctc ctcaagcctg tgggcagcat gcccagattc ccagacctta 3960
agacactgtg agagttgtct ctgttggtcc actgtgttta gttgcsaagga tttttccatg 4020
tgtggtggtg ttttttgtta ctgttttaaa gggtgcsccat ttgtgatcag cattgtgact 4080
tggagataat aaaatttaga ctataaactt gaaaaaaaa 4119

<210> 18 

<211> 2653 

<212> DNA

<213> Homo sapiens 



<400> 18 

gagcgcggct ggagtttgct gctgccgctg tgcagtttgt tcaggggctt gtggcggtga 60
gtccgagagg ctgcgtgtga gagacgtgag aaggatcctg cactgaggag gtggaaagaa 120
gaggattgct cgaggaggcc tggggtctgt gagacagcgg agctgggtga aggctgcggg 180
ttccggcgag gcctgagctg tgctgtcgtc atgcctcaaa cccgatccca ggcacaggct 240
acaatcagtt ttccaaaaag gaagctgtct cgggcattga acaaagctaa aaactccagt 300
gatgccaaac tagaaccaac aaatgtccaa accgtaacct gttctcctcg tgtaaaagcc 360
ctgcctctca gccccaggaa acgtctgggc gatgacaacc tatgcaacac tccccattta 420
cctccttgtt ctccaccaaa gcaaggcaag aaagagaatg gtccccctca ctcacataca 480
cttaagggac gaagattggt atttgacaat cagctgacaa ttaagtctcc tagcaaaaga 540
gaactagcca aagttcacca aaacaaaata ctttcttcag ttagaaaaag tcaagagatc 600
acaacaaatt ctgagcagag atgtccactg aagaaagaat ctgcatgtgt gagactattc 660
aagcaagaag gcacttgcta ccagcaagca aagctggtcc tgaacacagc tgtcccagat 720
cggctgcctg ccagggaaag ggagatggat gtcatcagga atttcttgag ggaacacatc 780
tgtgggaaaa aagctggaag cctttacctt tctggtgctc ctggaactgg aaaaactgcc 840
tgcttaagcc ggattctgca agacctcaag aaggaactga aaggctttaa aactatcatg 900
ctgaattgca tgtccttgag gactgcccag gctgtattcc cagctattgc tcaggagatt 960
tgtcaggaag aggtatccag gccagctggg aaggacatga tgaggaaatt ggaaaaacat 1020
atgactgcag agaagggccc catgattgtg ttggtattgg acgagatgga tcaactggac 1080
agcaaaggcc aggatgtatt gtacacgcta tttgaatggc catggctaag caattctcac 1140
ttggtgctga ttggtattgc taataccctg gatctcacag atagaattct acctaggctt 1200
caagctagag aaaaatgtaa gccacagctg ttgaacttcc caccttatac cagaaatcag 1260
atagtcacta ttttgcaaga tcgacttaat caggtatcta gagatcaggt tctggacaat 1320
gctgcagttc aattctgtgc ccgcaaagtc tctgctgttt caggagatgt tcgcaaagca 1380
ctggatgttt gcaggagagc tattgaaatt gtagagtcag atgtcaaaag ccagactatt 1440
ctcaaaccac tgtctgaatg taaatcacct tctgagcctc tgattaccaa gagggttggt 1500
cttattcaca tatcccaagt catctcagaa gttgatggta acaggatgac cttgagccaa 1560
gagggagcac aagattcctt ccctcttcag cagaagatct tggtttgctc tttgatgctc 1620
ttgatcaggc agttgaaaat caaagaggtc actctgggga agttatatga agcctacagt 1680
aaagtctgtc gcaaacagca ggtggcggct gtggaccagt cagagtgttt gtcactttca 1740
gggctcttgg aagccagggg cattttagga ttaaagagaa acaaggaaac ccgtttgaca 1800
aaggtgtttt tcaagattga agagaaagaa atagaacatg ctctgaaaga taaagcttta 1860
attggaaata tcttagctac tggattgcct taaattcttc tcttacaccc cacccgaaag 1920
tattcagctg gcatttagag agctacagtc ttcattttag tgctttacac attcgggcct 1980
gaaaacaaat atgacctttt ttacttgaag ccaatgaatt ttaatctata gattctttaa 2040
tattagcaca gaataatatc tttgggtctt actattttta cccataaaag tgaccaggta 2100
gacccttttt aattacattc actacttcta ccacttgtgt atctctagcc aatgtgcttg 2160
caagtgtaca gatctgtgta gaggaatgtg tgtatattta cctcttcgtt tgctcaaaca 2220
tgagtgggta tttttttgtt tgtttttttt gttgttgttg tttttgaggc gcgtctcacc 2280
ctgttgccca ggctggagtg caatggcgcg ttctctgctc actacagcac ccgcttccca 2340
ggttgaagtg attctcttgc ctcagcctcc cgagtagctg ggattacagg tgcccaccac 2400
cgcgcccagc taatttttta atttttagta gagacagggt tttaccatgt tggccaggct 2460
ggtcttgaac tcctgaccct caagtgatct gcccaccttg gcctccctaa gtgctgggat 2520
tataggcgtg agccaccatg ctcagccatt aaggtatttt gttaagaact ttaagtttag 2580
ggtaagaaga atgaaaatga tccagaaaaa tgcaagcaag tccacatgga gatttggagg 2640
acactggtta aag 2653



<210> 19
<211> 2907
<212> DNA 
<213> Homo sapiens 



<400> 19 

gccatctggg cccaggcccc atgccccgag gaggggtggt ctgaagccca ccagagcccc 60
ctgccagact gtctgcctcc cttctgactg tggccgcttg gcatggccag caacagcagc 120
tcctgcccga cacctggggg cgggcacctc aatgggtacc cggtgcctcc ctacgccttc 180
ttcttccccc ctatgctggg tggactctcc ccgccaggcg ctctgaccac tctccagcac 240
cagcttccag ttagtggata tagcacacca tccccagcca ccattgagac ccagagcagc 300
agttctgaag agatagtgcc cagccctccc tcgccacccc ctctaccccg catctacaag 360
ccttgctttg tctgtcagga caagtcctca ggctaccact atggggtcag cgcctgtgag 420
ggctgcaagg gcttcttccg ccgcagcatc cagaagaaca tggtgtacac gtgtcaccgg 480
gacaagaact gcatcatcaa caaggtgacc cggaaccgct gccagtactg ccgactgcag 540
aagtgctttg aagtgggcat gtccaaggag tctgtgagaa acgaccgaaa caagaagaag 600
aaggaggtgc ccaagcccga gtgctctgag agctacacgc tgacgccgga ggtgggggag 660
ctcattgaga aggtgcgcaa agcgcaccag gaaaccttcc ctgccctctg ccagctgggc 720
aaatacacta cgaacaacag ctcagaacaa cgtgtctctc tggacattga cctctgggac 780
aagttcagtg aactctccac caagtgcatc attaagactg tggagttcgc caagcagctg 840
cccggcttca ccaccctcac catcgccgac cagatcaccc tcctcaaggc tgcctgcctg 900
gacatcctga tcctgcggat ctgcacgcgg tacacgcccg agcaggacac catgaccttc 960
tcggacgggc tgaccctgaa ccggacccag atgcacaacg ctggcttcgg ccccctcacc 1020
gacctggtct ttgccttcgc caaccagctg ctgcccctgg agatggatga tgcggagacg 1080
gggctgctca gcgccatctg cctcatctgc ggagaccgcc aggacctgga gcagacggac 1140
cgggtggaca tgctgcagga gccgctgctg gaggcgctaa aggtctacgt gcggaagcgg 1200
aggcccagcc gcccccacat gttccccaag atgctaatga agattactga cctgcgaagc 1260
atcagcgcca agggggctga gcgggtgatc acgctgaaga tggagatccc gggctccatg 1320
ccgcctctca tccaggaaat gttggagaac tcagagggcc tggacactct gagcggacag 1380
ccggggggtg gggggcggga cgggggtggc ctggcccccc cgccaggcag ctgtagcccc 1440
agcctcagcc ccagctccaa cagaagcagc ccggccaccc actccccgtg accgcccacg 1500
ccacatggac acagccctcg ccctccgccc cggcttttct ctgcctttct accgaccatg 1560
tgaccccgca ccagccctgc ccccacctgc cctcccgggc agtactgggg accttccctg 1620
ggggacgggg agggaggagg cagcgactcc ttggacagag gcctgggccc tcagtggact 1680






gcctgctccc acagcctggg ctgacgtcag aggccgaggc caggaaatga gtgaggcccc 1740
tggtcctggg tctcaggatg ggtcctgggg gcctcgtgtt catcaagaca cccctctgcc 1800
cagctcacca catcttcatc accagcaaac gccaggactt ggctccccca tcctcagaac 1860
tcacaagcca ttgctcccca gctggggaac ctcaacctcc cccctgcctc ggttggtgac 1920
agagggggtg ggacaggggc ggggggttcc ccctgtacat accctgccat accaacccca 1980
ggtattaatt ctcgctggtt ttgtttttat tttaattttt ttgttttgat ttttttaata 2040
agaattttca ttttaagcac atttatactg aaggaatttg tgctgtgtat tggggggagc 2100
tggatccaga gctggagggg gtgggtccgg gggagggagt ggctcggaag gggcccccac 2160
tctcctttca tgtccctgtg ccccccagtt ctcctcctca gccttttcct cctcagtttt 2220
ctctttaaaa ctgtgaagta ctaactttac aaggcctgcc ttcccctccc tcccactgga 2280
gaagccgcca gcccctttct ccctctgcct gaccactggg tgtggacggt gtggggcagc 2340
cctgaaagga caggctcctg gccttggcac ttgcctgcac ccaccatgag gcatggagca 2400
gggcagagca agggccccgg gacagagttt tcccagacct ggctcctcgg cagagctgcc 2460
tcccgtcagg gcccacatca tctaggctcc ccagccccca ctgtgaaggg gctggccagg 2520
ggcccgagct gcccccaccc ccggcctcag ccaccagcac ccccataggg cccccagaca 2580
ccacacacat gcgcgtgcgc acacacacaa aaacacacac actggacagt agatgggccg 2640
acacacactt ggcccgagtt cctccatttc cctggcctgc cccccacccc caacctgtcc 2700
cacccccgtg ccccctcctt accccgcagg acgggcctac aggggggtct cccctcaccc 2760
ctgcaccccc agctggggga gctggctctg ccccgacctc cttcaccagg ggttggggcc 2820
ccttaccctg gagcccgtgg gtgcacctgt tactgttggg ctttccactg agatctactg 2880
gataaagaat aaagttctat ttattct 2907

<210> 20 

<211> 2096 


<212> DNA 

<213> Homo sapiens 



<220>

<221> misc_feature

<222> (23)..(23)

<223> n=a, c, g or t

<220>

<221> misc_feature

<222> (27)..(27)

<223> n=a, c, g or t

<220>

<221> misc_feature

<222> (80)..(80)

<223> n=a, c, g or t

<220>

<221> misc_feature

<222> (120)..(120)

<223> n=a, c, g or t

<400> 20 

agatgtttaa aaatactttg atnctcngtt tccacctctc ttaaattgtc tttccctatg 60
ttaaatatac agtcatcacn ttgctgaaaa aagttcgcaa tgagaacaat catctaaaan 120
tggctgtaac taggtcaggc gcggttgctc atgcctgtaa tcccaccact ttgggaggcc 180
gaggcaattg gatcacctga ggtcaggatt ttgagaccag cttgaccaac atggtggaat 240
cccatctcta ctaaaaatac aaaaaattag ccgggtgtgg tggcacaccc ctgtaatccc 300
acctactcag gaggctgagg caggaaaatc ccttgaaccc aggaggcaaa ggttgcattg 360
agccgaaata acaccactgc actccagcct ggacgataga gtgagacccc atctcaaaaa 420
aagagcagct gtgacaaatg cctgtattga attgcaggtc agtcttccac ctccactacc 480
ggtgccaaaa aaagggctgc cccaaaagga actaaaaggg atccagcttt gaattctggt 540
gtctctcaaa agcctgatcc tgccaaaacc aagaatcgcc gcaaaaggaa gccatccact 600
tctgatgatt ctgactctaa ttttgagaaa attgtttcga aagcagtcac aagcaaggtg 660
agtgttgatc ctagtcagtc cttttgctgt agatgttctg aaacacgtaa ctaagccatt 720
gttcttaaaa atttggcata tctttaagaa aattaactct catattctgt tagcttttac 780
tgtacatatt tagttttaac aaagttaaat atgccactta tttggccaat ggaagagttg 840
gccttagatc tgcttcttat tacttggtag aaaatagaaa actccttgaa tatagtgtct 900
tgatacattt ttttacatta caattatgtt gtcagattta caatgtgcaa gttacctggg 960
cttttctctt ttagaaatcc aagggggaga gtgatgactt ccatatggac tttgactcag 1020
ctgtggctcc tcgggcaaaa tctgtacggg caaagaaacc tataaagtac ctggaagagt 1080
cagatgaaga tgatctgttt taaaatgtga ggcgattatt ttaagtaatt atcttaccaa 1140
gcccaagact ggttttaaag ttacctgaag ctcttaactt cctcccctct gaatttagtt 1200
tggggaaggt gtttttagta caagacatca aagtgaagta aagcccaagt gttctttagc 1260
tttttataat actgtataaa tagtgaccat ctcatgggca ttgttttctt ctctgctttg 1320
tctgtgtttt gagtctgctt cttttgtctt taaaacctga tttttaagtt cttctgaact 1380
gtagaaatag ctatctgatc acttcagcgt aaagcagtgt gtttattaac catccactaa 1440
gctaaaacta gagcagtttg atttaaaagt gtcactcttc ctccttttct actttcagta 1500
gatatgagat agagcataat tatctgtttt atcttagttt tatacataat ttaccatcag 1560
atagaacttt atggttctag tacagatact ctactacact cagcctctta tgtgccaagt 1620
ttttctttaa gcaatgagaa attgctcatg ttcttcatct tctcaaatca tcagaggccg 1680
aagaaaaaca ctttggctgt gtctataact tgacacagtc aatagaatga agaaaattag 1740
agtagttatg tgattatttc agctcttgac ctgtcccctc tggctgcctc tgagtctgaa 1800
tctcccaaag agagaaacca atttctaaga ggactggatt gcagaagact cggggacaac 1860
atttgatcca agatcttaaa tgttatattg ataaccatgc tcagcaatga gctattagat 1920
tcattttggg aaatctccat aatttcaatt tgtaaacttt gttaagacct gtctacattg 1980
ttatatgtgt gtgacttgag taatgttatc aacgtttttg taaatattta ctatgttttt 2040
ctattagcta aattccaaca attttgtact ttaataaaat gttctaaaca ttgaaa 2096

<210> 21

<211> 2160

<212> DNA

<213> Homo sapiens
<400> 21

agccccctgc ccctcgccgc cccccgccgc ctgcctgggc cgggccgagg atgcggcgca 60
gcgcctcggc ggccaggctt gctcccctcc ggcacgcctg ctaacttccc ccgctacgtc 120
cccgttcgcc cgccgggccg ccccgtctcc ccgcggcctc cgggtccggg tcctccagga 180
cggccaggcc gtgccgccgt gtgccatccg ccgctcgccc gcgcgccgcg cgctccccgc 240
ctgcgcccag cgccccgcgc ccgcgcccca gtcctcgggc ggtccatgct gcccctctgc 300
ctcgtggccg ccctgctgct ggccgccggg cccgggccga gcctgggcga cgaagccatc 360
cactgcccgc cctgctccga ggagaagctg gcgcgctgcc gcccccccgt gggatgcgag 420
gagctggtgc gagaggcggg ctgcggctgt tgcgccactt gcgccctggg cttggggatg 480
ccctgcgggg tgtacacccc ccgttgcggc tcgggcctgc gctgctaccc gccccgaggg 540
gtggagaagc ccctgcacac actgatgcac gggcaaggcg tgtgcatgga gctggcggag 600
atcgaggcca tccaggaaag cctgcagccc tctgacaagg acgagggtga ccaccccaac 660
aacagcttca gcccctgtag cgcccatgac cgcaggtgcc tgcagaagca cttcgccaaa 720
attcgagacc ggagcaccag tgggggcaag atgaaggtca atggggcgcc ccgggaggat 780
gcccggcctg tgccccaggg ctcctgccag agcgagctgc accgggcgct ggagcggctg 840
gccgcttcac agagccgcac ccacgaggac ctctacttca tccccatccc caactgcgac 900
cgcaacggca acttccaccc caagcagtgt cacccagctc tggatgggca gcgtggcaag 960
tgctggtgtg tggaccggaa gacgggggtg aagcttccgg ggggcctgga gccaaagggg 1020
gagctggact gccaccagct ggctgacagc tttcgagagt gaggcctgcc agcaggccag 1080
ggactcagcg tcccctgcta ctcctgtgct ctggaggctg cagagctgac ccagagtgga 1140
gtctgagtat gagtcctgtc tctgcctgcg gcccagaagt ttccctcaaa tgcgagtgtg 1200
cacgtgtgcg tgtgcgtgcg tgtgtgtgtg tttgtgagca tgggtgtgcc cttggggtaa 1260
gccagagcct ggggtgttct ctttggtgtt acacagccca agaggactga gactggcact 1320
tagcccaaga ggtctgagcc ctggtgtgtt tccagatcga tcctggattc actcactcac 1380
tcattccttc actcatccag ccacctaaaa acatttactg accatgtact acgtgccagc 1440
tctagttttc agcattggga ggttttattc tgacttcctc tgattttggc atgtggagac 1500
actcctataa ggagagttca agcctgtggg agtagaaaaa tctcattccc agagtcagag 1560
gagaagagac atgtaccttg accatcgtcc ttcctctcaa gctagcccag agggtgggag 1620
cctaaggaag cgtggggtag cagatggagt aatggtcacg aggtccagac ccactcccaa 1680
agctcagact tgccaggctc cctttctctt cttccccagg tccttccttt aggtctggtt 1740
gttgcaccat ctgcttggtt ggctggcagc tgagagccct gctgtgggag agcgaagggg 1800
gtcaaaggaa gacttgaagc acagagggct agggaggtgg ggtacatttc tctgagcagt 1860
cagggtggga agaaagaatg caagagtgga ctgaatgtgc ctaatggaga agacccacgt 1920
gctaggggat gaggggcttc ctgggtcctg ttcccctacc ccatttgtgg tcacagccat 1980
gaagtcaccg ggatgaacct atccttccag tggctcgctc cctgtagctc tgcctccctc 2040
tccatatctc cttcccctac acctccctcc ccacacctcc ctactcccct gggcatcttc 2100
tggcttgact ggatggaagg agacttagga acctaccagt tggccatgat gtcttttctt 2160

<210> 22

<211> 2215

<212> DNA

<213> Homo sapiens



<400> 22 

ctgcagggag ccatgattgc accactgcac tccagcctgg gcaacagagt gagaccatgt 60
ctcaagaaaa aaaaaaaaga aagaaaccac tgctctaggc taaatcccag ccagagttgg 120
agccacccag ctaaactggc ctgttttccc tcatttcctt ccccgaaggt atgcctgtgt 180
caagatgagg tcacggacga ttacatcgga gacaacacca cagtggacta cactttgttc 240
gagtctttgt gctccaagaa ggacgtgcgg aactttaaag cctggttcct ccctatcatg 300
tactccatca tttgtttcgt gggcctactg ggcaatgggc tggtcgtgtt gacctatatc 360
tatttcaaga ggctcaagac catgaccgat acctacctgc tcaacctggc ggtggcagac 420
atcctcttcc tcctgaccct tcccttctgg gcctacagcg cggccaagtc ctgggtcttc 480
ggtgtccact tttgcaagct catctttgcc atctacaaga tgagcttctt cagtggcatg 540
ctcctacttc tttgcatcag cattgaccgc tacgtggcca tcgtccaggc tgtctcagct 600
caccgccacc gtgcccgcgt ccttctcatc agcaagctgt cctgtgtggg catctggata 660
ctagccacag tgctctccat cccagagctc ctgtacagtg acctccagag gagcagcagt 720
gagcaagcga tgcgatgctc tctcatcaca gagcatgtgg aggcctttat caccatccag 780
gtggcccaga tggtgatcgg ctttctggtc cccctgctgg ccatgagctt ctgttacctt 840
gtcatcatcc gcaccctgct ccaggcacgc aactttgagc gcaacaaggc catcaaggtg 900
atcatcgctg tggtcgtggt cttcatagtc ttccagctgc cctacaatgg ggtggtcctg 960
gcccagacgg tggccaactt caacatcacc agtagcacct gtgagctcag taagcaactc 1020
aacatcgcct acgacgtcac ctacagcctg gcctgcgtcc gctgctgcgt caaccctttc 1080
ttgtacgcct tcatcggcgt caagttccgc aacgatctct tcaagctctt caaggacctg 1140
ggctgcctca gccaggagca gctccggcag tggtcttcct gtcggcacat ccggcgctcc 1200
tccatgagtg tggaggccga gaccaccacc accttctccc cataggcgac tcttctgcct 1260
ggactagagg gacctctccc agggtccctg gggtggggat agggagcaga tgcaatgact 1320
caggacatcc ccccgccaaa agctgctcag ggaaaagcag ctctcccctc agagtgcaag 1380
ccctgctcca gaagttagct tcaccccaat cccagctacc tcaaccaatg ccgaaaaaga 1440
cagggctgat aagctaacac cagacagaca acactgggaa acagaggcta ttgtccccta 1500
aaccaaaaac tgaaagtgaa agtccagaaa ctgttcccac ctgctggagt gaaggggcca 1560
aggagggtga gtgcaagggg cgtgggagtg gcctgaagag tcctctgaat gaaccttctg 1620
gcctcccaca gactcaaatg ctcagaccag ctcttccgaa aaccaggcct tatctccaag 1680
accagagata gtggggagac ttcttggctt ggtgaggaaa agcggacatc agctggtcaa 1740
acaaactctc tgaacccctc cctccatcgt tttcttcact gtcctccaag ccagcgggaa 1800
tggcagctgc cacgccgccc taaaagcaca ctcatcccct cacttgccgc gtcgccctcc 1860
caggctctca acaggggaga gtgtggtgtt tcctgcaggc caggccagct gcctccgcgt 1920
gatcaaagcc acactctggg ctccagagtg gggatgacat gcactcagct cttggctcca 1980
ctgggatggg aggagaggac aagggaaatg tcaggggcgg ggagggtgac agtggccgcc 2040
caaggccacg agcttgttct ttgttctttg tcacagggac tgaaaacctc tcctcatgtt 2100
ctgctttcga ttcgttaaga gagcaacatt ttacccacac acagataaag ttttcccttg 2160
aggaaacaac agctttaaaa gaaaaaagaa aaaaaaagct tggtaagtca agtag 2215

<210> 23

<211> 958

<212> DNA

<213> Homo sapiens



<400> 23 

ggggccggac gcgaggggcg gggcgagcgc gggacaaagg gaagcgaagc cggagctgcg 60
ggcgcttttt ctgcccgcgg tgtctcagat tcattcttaa ggaactgaga acttaatctt 120
ccaaaatgtc aaaaagacca tcttatgccc cacctcccac cccagctcct gcaacacaaa 180
tgcccagcac accagggttt gtgggataca atccatacag tcatctcgcc tacaacaact 240
acaggctggg agggaacccg agcaccaaca gccgggtcac ggcatcctct ggtatcacga 300
ttccaaaacc cccaaagcca ccagataagc cgctgatgcc ctacatgagg tacagcagaa 360
aggtctggga ccaagtaaag gcttccaacc ctgacctaaa gttgtgggag attggcaaga 420
ttattggtgg catgtggcga gatctcactg atgaagaaaa acaagaatat ttaaacgaat 480
acgaagcaga aaagatagag tacaatgaat ctatgaaggc ctatcataat tcccccgcgt 540
accttgctta cataaatgca aaaagtcgtg cagaagctgc tttagaggaa gaaagtcgac 600
agagacaatc tcgcatggag aaaggagaac cgtacatgag cattcagcct gctgaagatc 660
cagatgatta tgatgatggc ttttcaatga agcatacagc caccgcccgt ttccagagaa 720
accaccgcct catcagtgaa attcttagtg agagtgtggt gccagacgtt cggtcagttg 780
tcacaacagc tagaatgcag gtcctcaaac ggcaggtcca gtccttaatg gttcatcagc 840
gaaaactaga agctgaactt cttcaaatag aggaacgaca ccaggagaag aagaggaaat 900
tcctggaaag cacagattca tttaacaatg aacttaaaag gttgtgcggt ctgaaagt 958
<210> 24

<211> 6483 

<212> DNA 

<213> Homo sapiens 



<400> 24 

aagcttctaa ttgcagttaa accacctgtt acatatcttc aggaaaaaat cacaacctct 
caaottcaac ttcctottct ataaattaga aataacaata acoaoacctg taaccccagc 
actttgggag gccaaggcag gcagatcaag aggtgaggag attgagacoa tcctggotaa 
catgatgaaa occtgtctct accaaaaaga caaaaaatta gccaggtatg gtggoaoaca 
cotgtagtcc oagotactcg ggaggctgag gcaggagaat ggegtgaaoo cgggaggtgg 
agcttgcagt gagcogagat ggcgccactg cactccagcc tgggcgacag agcaagcctc 
cgtctaaaaa aaaaaaaaga aagaaagaaa gaaagaaaga aaagaaataa taataacoac 
cattcctate toaacagott gttctagaaa tttttaaago acagtatcao aaacagcact 
acataattgt aaaacatgta tgaatatata catocaaaca acagcaatgt catagoatat 
gggtagatat aatettatac aatgtaccaa aatcccaatt tacttcacta gacaaactgt 
tataccaaat totgtacaca gtatatccaa gaaaatgtgt tgtttttatt gagaaaotga 
acctagcttg ggaacaeatg tgoaeagtct agtteataat atttggtgca agtatcattc 
totaatatag atttacattt ttgcaagcaa atttttaott gcaatcgtaa catatccaaa 
ttttcccttt ttactcaatc agaacttagt gtaaagtact acaagttagt tcttcggatt 
tcatgctaag aaaataatgc agattttetg cattattatg gtottcacag aaacettaac 
tatgatgaat ttaaaagtgc aaaataatcc aggataactt tatgatttca cattttttaa 
tgttaaaaat aatgccatca ttaattagaa aattctaaaa tcattacttc cactttctta 
ggcaaaatat caatatactc tcatttgcca aataaattaa aagatctcct acaaacacaa 
tctcctaaat tgtggtttta tggctttaat gttttatgtg tggoaactat tgatgctagt 
taaaatttta gaaactcttt ctttttgatt ccctacagtt gtctacaaga accttattgt 
agcatgatcc tgccagactt tatactattt gttgctocaa ttaaaactgt ttaaaacatg 
aatttgaaaa atcttatttt aactataatt ttgtagctga aaottttttt tctaaactfct 
gcaaacattc tatgcaacct gaattagtgc tgagaaaatt ggatcttaat ggttgctcaa 
tgttcttcaa caggtgaaaa gcataataaa acatgctcat ctgaactcca cccattttca 
atttcaacat agcatacctc gtgtttattc ttagggcaaa ttcaaaattg tacatattag 
gattggttat taotgaagat aatttatgca atcataagcc aaagatgota agttggcaaa 
aagaaaacaa tgtaagtaag caaactctaa cacatgtgga oaoaocctct cagtatataa 
aggcttgtea ctgtccttgg tagcaggcac tccctgggct aaacagcato aocatgtctg 
ttcgatacag atoaagcaag cactactctt octcccgcag tggaggagga ggaggaggag 
gaggatgtgg aggaggagga ggagtgtcat ccctaagaat ttotagoago aaaggctccc 
ttggtggagg atttagotca ggggggttca gtggtggctc ttttagccgt gggagotctg 
gtgggggatg atttgggggc tcatcaggtg gctatggagg attaggaggt tttggtggag 
gtagotttoa tggaagctat ggaagtagca gctttggtgg gagttatgga ggcagctttg 
gagggggcaa tttcggaggt ggcagctttg gtgggggcag ctttggtgga ggcggctttg 
gtggaggcgg ctttggagga ggctttggtg gtggatttgg aggagatggt ggccttctct 
ctggaaatga aaaagtaacc atgcagaatc tgaatgaccg cctggcttcc tacttggaca 2160 
aagttcgggc tctggaagaa tcaaactatg agctggaagg caaaatcaag gagtggtatg ?»» 
aaaagcatgg caactcacat cagggggagc ctcgtgacta cagcaaatac tacaaaacca 
tcgatgacct taaaaatcag gtaagaggta tttttaaatc cagctttaag tatcttgtcc 
atgtaatcca gacagatgaa tcttaaatta agcacaatgt ggctgttcac tatgcttacc 
catgttactt tcttccttca aaaataaccc agtctcatca aagataaaca tctgtgaaac 
tatggtcatg gcaatcttca tccagcaagt gtgctacttg tcttaagagg atgggagatt 
tactaagcac ttttgaggtt ttaatgagca tacaatgagt ccacagttaa aatatgctag 2580 
gctatttaca aatgtagaaa ctgaaaaaaa aaatcatgat atgaatcaga acaaaatgtt 2640 
attcagactg ataacaagcc atattcagta ccaacatggc aagaaaaata aattttccag 2700 
tatgaaaatg ggacactgct tgcttctaag gaatttctga attgtaccta ttgtgtacca 
gttcagagtg tatttattta ttagtattta tcatgagtta aacaaatgca ggtgtgagtc 
agccaaagca tggctgaaat acatggaaat cacatagtct aaaagaggag ggcacactta 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 

960
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 



2760 
2820 
2880 
caggaataca tctatataat tccagttagt tttcagaaag gaataattcg tgtacagaaa
tacaagactg gagaaattcc aagagaacaa ataattcaaa gttaagtata tgggtaagoo 
tgcaatattt catatttaaa ataaaaaatt ttcccaagat tttgtaagag aacaacataa 
aagtgcagag tgcatetatg toactacaaa agccatatct gcatctgacc tcttctoaaa 
taactgtgcc totooctcoa gattctoaac ctaacaactg ataatgccaa catoctgctt 
oagatcgaca atgccaggct ggcagotgat gacttcaggc tgaagtaagt taagtgatog 
ttgtataata ctatcacaac gaatacatca gtggttttta acaatgactt gggatgocot 
caataacatt tacatttttc tgaattoacc caaagttaaa tagtattgga gttatctgag 
aaattttcca tgtcagtgtt acctttttgg caatattaaa ggaagaaaat gcatattaaa 
ataactgcta aggtttttte cattaaacoa otattacttc taagagaact gtacatgaca 3480 

* . . __«... j.«.«.4. 4- -, ~4-„4- r.t-rrr.~mrwa*- 3540 

3600 
3660 
3720 
3780 
3840 



gtttt»a««.d duuAOAaoww vwy^^u ~~ rj j j 

taacaccagc cgtgaaaaat ggoatgatca aaatgtcata ccttaagcat ttttttgggo 
ttaacaatgt aaagttgaaa tttccttott tttacaatat ttgcttgtta attactaagg 
atccctacag actgtttaaa attttttttc catcattcac acagatacta acaaaaccag 
agtaatcaag acaattattg aagaggtggc goccgacggt agagttcttt oatctatggt 
tgaatcagaa accaagaaac actactatta aactgcatca agaggaaaga gtctcccttc 
acacagaoca ttatttacag atgcatggaa aacaaagtct ccaagaaaac acttctgtct 
tgatggtcta tggaaataga ccttgaaaat aaggtgtcta caaggtgttt tgtggtttct 



2940 
3000 
3060 
3120 
3180 
3240 
3300 
3360 
3420 
aatattgcca ttacatgaga tcaactatgt agttgctttt taaatagtct ctgcccagat
acatctcccc tatataagtt ataaccagta ttgatatoat gcttgtttca ggtatgagaa 
tgaggtagct ctgcgccaga gcgtggagge tgacatcaac ggcctgcgta gggtgctgga 
tgagotgacc ctgaccaagg ctgacctgga gatgcaaatt gagagcctga ctgaagagct 
ggoctatotg aagaagaacc acgaggaggt gaoacaaaag ttatactttt cccagccaaa 
agagagttca ttatggtcct egtgtagcea ataaatcttt ctgttcctca aacaggaaat 
gaaagacott cgaaatgtgt ccactggtga tgtgaatgtg gaaatgaatg ctgocccggg 3900 
tgttgatotg actcaacttc tgaataacat gagaagecaa tatgaacaao ttgctgaaca 3960 
aaaccgoaaa gatgctgaag cctggttcaa tgaaaaggta aagtaatatt oottatagtg 4020 
aaactcatgg aggttttatc atttcagaat ttcctcaccc ttttocttgt ttttaatact 4080 
ctagagcaag gaactgacta oagaaattga taataacatt gaacagatat ccagotataa 4140 
atctgagatt aotgaattga gaogtaatgt acaagctetg gagatagaac tacagtccoa 4200 
actggecttg gtatgttaac tctcatgaaa tgacttcaac tttatcatac aaagtttcat 4260 
gctcaectaa gaatatgoaa tgcaacaaaa aaatgoagag ttggaggtaa gaaagagaaa 4320 
acaaagtgaa getcatgtta atggaggaaa agtactacta gtgttgatct aaaagtgctg 4380 
aaaotgaaat ggtgccatta aacataoaac aaattctgtt cattttctta ttcttctata 4440 
taatgcctta ctaaataatc aaataagcgt caocatactc aactgaacaa ggaagtcact 
aagccacaaa aaaatcegtt tcagaaacaa tcoctggaag cctcettgge agaaacagaa 
ggtcgctact gtgtgcagct ctcacagatt cacgcccaga tatccgctct ggaagaacag 
ttgcaacaga ttcgagctga aaccgagtgc cagaatactg aataccaaca actcctggat 
attaagatcc gactggagaa tgaaattcaa acctaccgca gcctgctaga aggagaggga 
aggtaaatta taacatgaaa agttatcooa gtttctttta ttcaatattc oagatagoaa 
ggcttatcta aaccccaaga agatgccaga gaatgagagg aagggaggag agagggtaga 
gtacagaaaa aggagtacgc aaccgcaatc tcactttctc atgaatttgg cccaaaatga 
ttcttaagag ttctgtgaac ttaacattgt tttcaaagga tgggttttaa aatatatacc 
tggcagggtt ttattttttc aacacgtttt gcttattttc taaattaacg goaactggaa 
agctaoccao cgttttccaa cgttagagat aaccgaatgt gacctcacco cgtttagttc 
cggaggcggc ggacgcggcg gcggaagttt cggcggcggo tacggcggcg gaagctccgg 
cggcggaagc tccggcggcg gctacggcgg cggccacggo ggcagttccg goggcggcta 
oggaggogga agctccggcg gcggaagctc cggcggcggo tacgggggcg gaagctccag 
cggcggccac ggcggcggaa gctccagcgg cggccacggo ggcagttcca gcggcggcta 
cggtggtggo agttccggcg gcggcggcgg cggctacggg ggcggcagct ccggcggcgg 
cagcagctce ggcggcggat acggcggcgg cagctccagc ggaggccaca agtcctcctc 
ttccgggtcc gtgggcgagt cttcatctaa gggaccaagg tcagcagaaa otagctgggg 
taatctagaa ttagttttaa cttcctgtga tggttttttt gcgctttaag ctctagagtt 
qttttaaaaa attaaaaatc ttagagacgg ttccgtttgc atttgttcac aaactactot 5640 

■-- --4-4— _ -4- 4- 4- 4- 4-4-4- r, 5700 

5760 
5820 
5880 
5940 
6000 
gtatttcttc ttttcacttt accacaaagt gttctttaat ggaaagaaaa acaactttgt 6120
gttctcattt actaatgaat ttcaataaac tttcttactg atgcaaacta tcccaatttg 6180 
tcagaattta tctttactta agtacataat actctttaaa attaaagatt agtaacccat 6240 
agcagttgaa ggttgatgta tccagaaatt cggaagacag aactattgtc atgccttttc 6300 
taagtttttt aatcatgtat gttcagacca ccgtcagtaa attoactgag taaagtctgt 6360 
aaatccccaa tattactctt taagatacac aatatgtgga aggctcccag ctctctggct 6420 
ttaaattatt tcaatcctgg aaattctgga atatctcaaa tataaccccc aaaataataa 6480
taa 6483 
agttgtggcc accttcccca ggccatggat ctctccaaca acaccatgtc actctcagtg
cgcacccccg gactgtoccg gcggctotcc tcgcagagtg tgataggcag aocoaggggc 
atgtotgctt ccagtgttgg aagtggttat gggggaagtg eetttggctt tggagccagc 
tgtgggggag gcttttotgc tgcttecatg tttggttcta gttccggott tgggggtggc 240 
tcoggaagtt ecatggoagg aggactgggt getggttatg ggagagccct gggtggaggt 300 
agctttggag ggctggggat gggatttggg ggcagcccag gaggtggctc tctaggtatt 360 
ctotogggca atgatggagg ecttctttct ggatcagaaa aagaaactat gcaaaatott 
aatgatagat tagcttocta cotggataag gtgcgagctc tagaagaggc taatactgag 
ctagaaaata aaattcgaga atggtatgaa acacgaggaa ctgggactgc agatgottea 



ay^Luu^^yy ^yyy'-yy^^**' »yy^"yy w " — -> — — j - j -- - 

gacotcacoa ggctcctcaa tgatatgcgg gcgcagtatg aaaccatcgc tgagcagaat 
cggaaggacg ctgaagcctg gttcattgaa aagagcgggg agctccgtaa ggagattagc 
cagagcgatt acagcaaata ttatccactg attgaagacc tcaggaataa gatcatttca 600

gccagcattg gaaatgccca gctcctcttg cagattgaca atgcgagact agetgotgag 660 

gacttcagga tgaagtatga gaatgaactg gecotgcgcc agggcgtaga ggccgacatc 720 

aatggcctgc gccgggtgct ggacgagctg accotgaoca ggaeegacct ggagatgcag 780 

atcgagagcc tgaacgagga gctggcctac atgaagaaga accacgagga tgagctccaa 840 

agcttccggg tgggcggcec aggcgaggtc agcgtagaaa tggacgctgc ccccggagtg 900 
«v_>_«»w=^^ agcagcttca gtccagcaag agcgaggtca ccgacctgcg tcgcgccttt 1080
cagaacctgg agatcgagct acagtcccag ctcgccatga agaaatooot ggaggactoe 
ttggccgaag ccgagggcga ttactgcgcg cagotgtocc aggtgcagca gctcatcagc 
aacotggagg cacagctgct ccaggtgcgc gcggaogcag agcgccagaa ogtggaccac 
cagcggctgc tgaatgtcaa ggcccgcctg gagotggaga ttgagacota ccgccgcctg 
ctggacgggg aggcccaagg tgatggtttg gaggaaagtt tatttgtgac agactocaaa 
tcacaagcac agtoaaotga ttcctctaaa gacccaacca aaacccgaaa aatcaagaca 
gttgtgoagg agatggtgaa tggtgaggtg gtctcatotc aagttcagga aattgaagaa 
otaatgtaaa atttoacaag atctgcccoa tgattggttc cttaggaaca agaaatttac 
aagtagaaat tattcotttc agagtaacat gctgtattao ttcaatcoct atttttgtct 
gttooatttt ctttggattc cctattcaca ttgaatcctt tttgcccttc tgaaacaata 1680 
ttoagtcaca agtcattttg gtcatgttgg tctttgtaac aaatcaaaat tacottatat 1740 
cottctggac aaotggagta gtcttttaac gaactttctt ctggtaaccc ggaatatttt 1800 
cttaatcata gagctttact caagtagtat tgttttaata gagttaattg taataaaaga 1860 
tgaatggtaa a 1871 
caagtaccag acggagcagt ccctgcggca gctggtggag tccgacatca acagcctgcg
caggattctg gatgagctga ccctgtgcag gtctgacctg gaggcccaga tggagtccct 
gaaggaggag ctgctgtcco tcaagcagaa coatgagcag gaagtcaaca ccttgcgctg 
ocagcttgga gaocgcctca acgtggaggt ggacgctgot cccgctgtgg acctgaacca 
ggtcctgaac gagaocagga atcagtatga ggccctggtg gaaaocaacc gcagggaagt 
ggagoaatgg ttcgccacgc agaeogagga gctgaaoaag caggtggtat ccagotcgga 
gcagctgcag tcctaccagg cggagatcat ogagctgaga cgcacagtoa atgccctgga 
gatcgagctg caggcccagc acaacctgcg atactctctg gaaaacacgc tgacagagag 
atcatggtct ggaaggagaa caaatgccca gcgtttgggt ctgactctga gcctagggct
actgatcctc ctoaccccag gtocotctcc tgtagtcagt ctgagttotg atggtcagag 1320 
gttggagctg tgacagtggc ataegaggtg ttttgttetc totgctgott ctacctttat 1380 
1447




<400> 27 


























Met Asn Pro Asn Cys Ala Arg Cys Gly Lys Ile Val Tyr Pro     Glu
1 5 10 15

Lys Val Asn Cys Leu Asp Lys Phe Trp His Lys Ala Cys Phe His Cys
20 25 30

Glu Thr Cys Lys Met Thr Leu Asn Met Lys Asn Tyr Lys Gly Tyr Glu
35 40 45

Lys Lys Pro     Cys Asn Ala His Tyr Pro     Gln Ser Phe     Met
50 55 60

Val Ala Asp Thr Pro Glu Asn     Arg     Lys Gln Gln Ser Glu Leu
65 70 75 80

Gln Ser Gln Val Arg Tyr Lys Glu Glu Phe Glu Lys         Gly Lys
85 90 95

Gly Phe Ser Val Val Ala Asp Thr Pro Glu Leu Gln Arg Ile     Lys
100 105 110

Thr Gln Asp Gln Ile Ser Asn Ile Lys Tyr His Glu Glu Phe Glu Lys
115 120 125

Ser Arg Met Gly Pro Ser Gly Gly Glu Gly Met Glu Pro Glu Arg Arg
130 135 140

Asp Ser Gln Asp Gly Ser Ser Tyr Arg Arg Pro Leu Glu Gln Gln Gln
145 150 155 160



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 






Pro His His Ile Pro Thr Ser Ala Pro Val Tyr Gln Gln Pro Gln Gln
165 170 175

Gln Pro Val Ala Gln Ser Tyr Gly Gly Tyr Lys Glu Pro Ala Ala Pro
180 185 190

Val Ser Ile Gln Arg Ser Ala Pro Gly Gly Gly Gly Lys Arg Tyr Arg
195 200 205

Ala Val Tyr Asp Tyr Ser Ala Ala Asp Glu Asp Glu Val Ser Phe Gln
210 215 220

Asp Gly Asp Thr Ile Val Asn Val Gln Gln Ile Asp Asp Gly Trp Met
225 230 235 240

Tyr Gly Thr Val Glu Arg Thr Gly Asp Thr Gly Met Leu Pro Ala Asn
245 250 255

Tyr Val Glu Ala Ile
260
















































<210> 28

<211> 478

<212> PRT

<213> Homo sapiens

<400> 28
























Met Val Gln Lys Thr Ser Met Ser Arg Gly Pro Tyr Pro Pro Ser Gln
1 5 10 15

Glu Ile Pro Met Glu Val Phe Asp Pro Ser Pro Gln Gly Lys Tyr Ser
20 25 30

Lys Arg Lys Gly Arg Phe Lys Arg Ser Asp Gly Ser Thr Ser Ser Asp
35 40 45

Thr Thr Ser     Ser Phe Val Arg Gln Gly Ser Ala Glu Ser Tyr Thr
50 55 60

Ser Arg Pro Ser Asp Ser Asp Val Ser Leu Glu Glu Asp Arg Glu Ala
65 70 75 80

Leu Arg Lys Glu Ala Glu Arg Gln Ala Leu Ala Gln Leu Glu Lys Ala
85 90 95

Lys Thr Lys Pro Val Ala Phe Ala Val Arg Thr Asn Val Gly Tyr Asn
100 105 110

Pro Ser Pro Gly Asp Glu Val Pro Val Gln Gly Val Ala Ile Thr Phe
115 120 125

Glu Pro Lys Asp Phe Leu His Ile Lys Glu Lys Tyr Asn Asn Asp Trp
130 135 140

Trp Ile Gly Arg Leu Val Lys Glu Gly Cys Glu Val Gly Phe Ile Pro
145 150 155 160

Ser Pro Val     Leu Asp Ser Leu Arg Leu Leu Gln Glu Gln Lys Leu
165 170 175

Arg Gln Asn Arg Leu Gly Ser Ser Lys Ser Gly Asp Asn Ser Ser Ser
180 185 190

Ser Leu Gly Asp Val Val Thr Gly Thr Arg Arg Pro Thr Pro Pro Ala
195 200 205

Ser Ala Lys Gln Lys Gln Lys Ser Thr Glu His Val Pro Pro Tyr Asp
210 215 220

Val Val Pro Ser Met Arg Pro Ile Ile Leu Val Gly Pro Ser Leu Lys
225 230 235 240

Gly Tyr Glu Val Thr Asp Met Met Gln Lys Ala Leu Phe Asp Phe Leu
245 250 255








Lys His Arg Phe Asp Gly Arg Ile Ser Ile Thr Arg Val Thr Ala Asp
260 265 270

Ile Ser Leu Ala Lys Arg Ser Val     Asn Pro Ser Lys His Ile
275 280 285

Arg Ser     Thr Arg Ser Ser Leu Ala Glu Val Gln Ser
290 295 300

Arg Ile Phe     Leu Ala Arg     Gln Leu Val Ala
305 310 315 320

Leu Asp Ala Asp Thr Ile Asn His Pro Ala     Ser Lys Thr Ser
325 330 335

Ile Ile Val Tyr Ile Lys Ile Thr Ser Pro Lys Val Leu
340 345 350

Gln Arg Leu Ile Lys Ser Arg Gly Lys Ser Gln Ser Lys His Leu Asn
355 360 365

Val Gln Ile Ala Ala Ser Glu Lys Leu Ala Gln Cys Pro Pro Glu Met
370 375 380

Phe Asp Ile Ile Leu Asp Glu Asn Gln Leu     Ala Cys Glu His
385 390 395 400

Leu Ala Glu Tyr Leu Glu Ala Tyr Trp Lys Ala Thr His Pro Pro Ser
405 410 415

Ser Thr Pro Pro Asn Pro Leu Leu Asn Arg Thr Met Ala Thr Ala Ala
420 425 430

Leu Arg Arg Ser Pro Ala Pro Val Ser Asn Leu Gln Val Gln Val Leu
435 440 445

Thr Ser Leu Arg Arg Asn Leu Gly Phe Trp Gly Gly Leu Glu Ser Ser
450 455 460

Gln Arg Gly Ser Val Val Pro Gln Glu Gln Glu His Ala Met
465 470 475










<210> 29

<211> 196

<212> PRT

<213> Homo sapiens

<400> 29






















Met Ser Met Leu Arg Leu Gln Lys Arg Leu Ala Ser Ser Val Leu Arg
1 5 10 15

Lys Lys Val     Leu     Asn Glu Thr Asn Glu Ile
20 25 30

Ala Asn Ala Asn Ser Arg Gln Gln Ile Arg Lys Leu Ile Lys Asp Gly
35 40 45

Leu Ile Ile Arg Lys Pro Val     Val His Ser Arg Ala Arg Cys Arg
50 55 60

Lys Asn Thr Leu Ala Arg Arg Lys Gly Arg     Gly Ile Gly Lys
65 70 75 80

Arg Lys Gly Thr Ala Asn Ala Arg Met Pro Glu Lys Val Thr Trp Met
85 90 95

Arg Arg Met Arg Ile Leu Arg Arg Leu Leu Arg Arg Tyr Arg Glu Ser
100 105 110

Lys Lys Ile Asp Arg His Met Tyr His Ser Leu Tyr Leu Lys Val
115 120 125

Gly Asn Val Phe Lys Asn Lys Arg Ile Leu Met Glu His Ile His Lys
130 135 140














Leu Lys Ala Asp Lys Ala Arg Lys Lys Leu Leu Ala Asp Gln Ala Glu
145 150 155 160

Ala Arg Arg Ser Lys Thr Lys Glu Ala Arg Lys Arg Arg Glu Glu Arg
165 170 175

Leu Gln Ala Lys Lys Glu Glu Ile Ile Lys Thr Leu Ser Lys Glu Glu
180 185 190

Glu Thr Lys Lys
195

<210> 30 

<211> 1566 

<212> PRT 

<213> Homo sapiens 





<400> 30 


















Met Ser Ser Leu Leu Glu Arg Leu His Ala Lys Phe Asn Gln Asn Arg
1 5 10 15

Pro Trp Ser Glu Thr Ile Lys Leu Val Arg Gln Val Met Glu ys g
20 25 30

Val Val Met Ser Ser Gly Gly His Gln His Leu Val Ser Cys eu
35 40 45

Thr Leu Gln Lys Ala Leu Lys Val Thr Ser     Thr Asp
50 55 60

Arg Leu Glu Ser Ile Ala Gly Gln Asn Gly Leu Gly Ser His Leu Ser
65 70 75 80

Ser Gly Thr Glu Cys Tyr Ile Thr Ser Asp Met Phe Tyr Val Glu
85 90 95

Val Gln Leu Asp Pro Ala Gly Gln Leu Cys Asp Val Lys Val Ala His
100 105 110

His Gly Glu Asn Pro Val Ser Cys Pro Glu Leu Val Gln Gln Leu Arg
115 120 125

Glu Lys Asn Ser Asp Glu Phe Ser Lys His Leu Lys Gly Leu Val Asn
130 135 140

Leu Tyr Asn Leu Pro Gly Asp Asn Lys Leu Lys Thr Lys Met Tyr Leu
145 150 155 160

Ala Leu Gln Ser Leu Glu Gln Asp Leu Ser Lys Met Ala Ile Met Tyr
165 170 175

Trp Lys Ala Thr Asn Ala Gly Pro Leu Asp Lys Ile Leu His Gly Ser
180 185 190

Val Gly Tyr Leu Thr Pro Arg Ser Gly Gly His Leu Met Asn Leu Lys
195 200 205

Tyr Tyr Val Ser Pro Ser Asp Leu Leu Asp Asp Lys Thr Ala Ser Pro
210 215 220

Ile Ile Leu His Glu Asn Asn Val Ser Arg Ser Leu Gly Met Asn Ala
225 230 235 240

Ser Val Thr Ile Glu Gly Thr Ser Ala Val Tyr Lys Leu Pro Ile Ala
245 250 255

Pro Leu Ile Met Gly Ser His Pro Val Asp Asn Lys Trp Thr Pro Ser
260 265 270

Phe Ser Ser Ile Thr Ser Ala Asn Ser Val Asp Leu Pro Ala Cys Phe
275 280 285






Phe 


Leu 


Lys 


Phe 


Pro 


Gin 


Pro 


He Pro Val 


Ser Arg Ala Phe 


Val Gin 


55 




290 








295 




300 










As 
n n 


cys 




Glv 


He 


Pro Leu 


Phe Glu Thr 


Gin 


Pro Thr 








310 








315 




320 


Tyr Ala. 


Pro Leu 




Glu 


Leu 


He 


Thr Gin 


Phe Glu Leu 


Ser 


Lys Asp 




325 








330 






335 


Pro Asp 




Pro 


Leu 


Asn 


His 


Asn Met 


Arg Phe Tyr 


Ala 


Ala Leu 


340 










345 




350 




Pro Gly 


Gin Gin 


His 


Cys 


Tyr 


Phe 


Leu Asn 


Lys Asp Ala 


Pro 


Leu Pro 


355 






360 




365 






Asp Gly 


Arg Ser 


Leu 


Gin 


Gly 


Thr 


Leu Val 


Ser Lys He 


Thr 


Phe Gin 
















380 






His Pro 


Gl Ar 


Val 


Pro 


Leu 


He 


Leu Asn 


Leu He Arg 


His 


Gin Val 






390 








395 




400 


Ala Tyr 


As Thr 
n 


Leu 


He 


Gly 


Ser 


Cys Val 


Lys Arg Thr 


He 


Leu Lys 




405 








410 






415 


Glu As 
G u p 




Gly 


eu 


Leu 
eu 


Gin 


Phe Glu 


Val Cys Pro 


Leu 


Ser Glu 










425 




430 




Sar Arg 




a 


Ser 
er 


Phe 


Gin 




Val Asn Asp 


Ser 


Leu Val 


435 ^ 








440 




445 








V 1 Mat 


As 

P 


Val 


Gin 


Gly 


Leu Thr 


His Val Ser 


Cys 


Lys Leu 








455 






460 






Tyr ys 


Gly Leu 


Ser 


Asp 


Ala 


Leu 


He Cys 


Thr Asp Asp 


Phe 


He Ala 






470 








475 




480 


ys a 


Val Gin 




Cys 


Mat 


Ser 


He Pro 


Val Thr Mat 


Arg 


Ala He 




485 








490 






495 


Arg Arg 


T 




Thr 


He 


Gin 




Thr Pro Ala 


Leu 


Ser Leu 












505 




510 




Ha Ala 




V 1 


Glu 
U 


As 

P 


Met 


Val Lys 


Lys Asn Leu 


Pro 


Pro Ala 




^ u 








520 




525 








Pro Gly 


Tyr 


Gl 

Y 


Met 




Thr Gly 


Asn Asn Pro 


Met 


Ser Gly 


530 




535 






540 








Thr Ser 
r ar 


Thr 


Asn 


Thr 


Phe 


Pro Gly 


Gly Pro He 


Ala 


Thr Leu 


545 r 






550 








555 




560 


sn 


er 


Met 


Ser 


He 


L 


Asp Arg 


His Glu Ser 


Val 


Gly His 






565 








570 






575 


Gly Glu 


Asp Phe 


Ser 


ys 


Val 


Ser 




Pro He Lau 


Thr 


Ser Leu 


580 










585 




590 




Leu Gin 




Gl 


Asn 


Glv 


Gly 




He Gly Ser 


Ser 


Pro Thr 












600 




605 






Pro Pro 


His His 
is is 


Thr 


Pro 


Pro 




Val Ser 


Ser Met Ala 


Gly 


Asn Thr 


610 








615 






620 






Lys Asn 


His Pro 


Met 


Leu 


Met 


Asn 


Leu Leu 


Lys Asp Asn 


Pro 


Ala Gin 








630 








635 




640 


Asp e 


Ser Thr 


Leu 


Tyr 


Gly 


Ser 


Ser Pro 


Leu Glu Arg 


Gin 


Asn Ser 




645 








650 






655 


Ser ar 


Gly Ser 


Pro 


Ar 


Mat 


Glu 


He Cys 


Ser Gly Ser 


Asn 


Lys Thr 




660 










665 




670 




Lys Lys 


Lys Lys 


Ser 


Ser 


Arg 


Leu 




Glu Lys Pro 


Lys 


His Gin 










680 




685 








P P 




Gin 


Arg 


Glu 


Leu Phe 


Ser Mat Asp 


Val 


Asp Ser 


690 






695 






700 






Gin Asn 


Pro He 


Phe 


Asp 


Val 


Asn 


Met Thr 


Ala Asp Thr 


Leu 


Asp Thr 


705 






710 








715 




720 


Pro His 


lie Thr 


Pro 


Ala 


Pro 


Ser 


Gin Cys 


Ser Thr Pro 


Pro 


Thr Thr 






725 








730 






735 


Tyr Pro 


Gin Pro 


Val 


Pro 


His 


Pro 


Gin Pro 


Ser He Gin 


Arg 


Met Val 


740 










745 




750 




Arg Lau 


Ser Ser 


Ser 


Asp 


Ser 


He 


Gly Pro 


Asp Val Thr 


Asp 


He Leu 


755 








760 




765 










Ser Asp 


He 


Ala Glu 


Glu Ala 


Ser Lys 


Leu 


Pro Ser 


Thr 


Ser 


Asp Asp 


770 






775 






780 








Cys Pro 


Ala 


He Gly 


Thr Pro 


Leu Arg 


Asp 


Ser Ser 


Ser 


Ser 


Gly His 


785 






790 






795 






800 


Ser Gin 


Ser 


Thr Leu 


Phe Asp 


Ser Asp 


Val 


Phe Gin 


Thr 


Asn 


Asn Asn 






805 






810 








815 


Glu Asn 


Pro 


Tyr Thr 


Asp Pro 


Ala Asp 


Leu 


He Ala 


Asp 


Ala 


Ala Gly 






820 




825 








830 




Ser Pro 


Ser 


Ser Asp 


Ser Pro 


Thr Asn 


His 


Phe Phe 


His 


Asp 


Gly Val 




835 




840 






845 






Asp Phe 


Asn 


Pro Asp 


Leu Leu 


Asn Ser 


Gin 


Ser Gin 


Ser 


Gly 


Phe Gly 


850 






855 






860 










Tyr 


Phe Asp 
P 




Ser Gin 


Ser 




Asn 


Asp 


Asp Phe 


865 




870 






875 






880 


Lys Gly 


Phe 


Ala Ser 


Gin Ala 


Leu Asn 


Thr 


Leu Gly 


Val 


Pro 


Met Leu 




885 






890 








895 


Gly Gly 


Asp 


Asn Gly 


Glu Thr 


Lys Phe 


Lys 


Gly Asn 


Asn 


Gin 


Ala Asp 




900 




905 








910 




Thr Val 


Asp 


Phe Ser 


He He 


Ser Val 


Ala 


Gly Lys 


Ala 


Leu 


Ala Pro 




915 






920 






925 






Ala Asp 


Leu 


Met Glu 


His His 


Ser Gly 


Ser 


Gin Gly 


Pro 


Leu 


Leu Thr 


930 






935 






940 








Thr Gly 


Asp 




Lys Glu 


Lys Thr 


Gin 


Lys Arg 


Val 




Glu Gly 


945 






950 






955 






960 


Asn Gly 


Thr 




Ser Thr 




Gly 


Pro Gly 




Asp 


Ser Lys 




965 






970 








975 


Pro Gly 


Lys 


Arg Ser 


Arg Thr 


Pro Ser 




Asp Gly 


Lys 


Ser 


Lys Asp 






980 




985 








990 




Lys Pro 


Pro 


Lys Arg 


Lys Lys 


Ala Asp Thr Glu Gly Lys Ser Pro Se 




995 






1000 






1005 





His Ser 


Ser Ser 


Asn Arg 


Pro 


Phe 


Thr Pro 


Pro Thr 


Ser 


Thr 


Gly 


1010 






1015 






1020 








Gly Ser 


Lys Ser 


Pro Gly 


Ser 


Ala 


Gly Arg 


Ser Gin 


Thr 


Pro 


Pro 


1025 






1030 






1035 








Gly Val 


Ala Thr 


Pro Pro 


He 


Pro 


Lys He 


Thr He 


Gin 


He 


Pro 


1040 






1045 






1050 








Lys Gly 


Thr Val 


Met Val 


Gly 


Lys 


Pro Ser 


Ser His 


Ser 


Gin 


Tyr 


1055 






1060 






1065 








Thr Ser 


Ser Gly 


Ser Val 


Ser 


Ser 


Ser Gly 


Ser Lys 


Ser 


His 


His 


1070 






1075 






1080 








Ser His 


Ser Ser 


Ser Ser 


Ser 


Ser 


Ser Ala 


Ser Thr 


Ser 


Gly 


Lys 


1085 






1090 






1095 








Met Lys 


Ser Ser 


Lys Ser 


Glu 


Gly 


Ser Ser 


Ser Ser 


Lys 


Leu 


Ser 


1100 






1105 






1110 








Ser Ser 


Met Tyr 


Ser Ser 


Gin 


Gly 


Ser Ser 


Gly Ser 


Ser 


Gin 


Ser 


1115 




1120 






1125 








Lys Asn 


Ser Ser 


Gin Ser 


Gly 


Gly 


Lys Pro 


Gly Ser 


Ser 


Pro 


He 


1130 






1135 






1140 










His Gly 


Leu Ser 


Ser 


Gly 


Ser Ser 


Ser Thr 


Lys 


Met 


Lys 


1145 






1150 






1155 










Gly Lys 




Ser 


Leu 


Met Asn 


Pro Ser 


Leu 


Ser 


Lys 


1160 






1165 






1170 








Pro Asn 


He Ser 




His 


Ser 


Arg Pro 


Pro Gly 


Gly 


Ser 


Asp 


1175 






1180 






1185 










Ala Ser 


Pro Met 


Lys 


Pro 


Val Pro 


Gly Thr 


Pro 


Pro 


Ser 


1190 






1195 






1200 








Ser Lys 


Ala Lys 


Ser Pro 


He 


Ser 


Ser Gly 


Ser Gly 


Gly 


Ser 


His 


1205 






1210 






1215 












Met 


Ser 


Gly Thr 


Ser Ser Ser 


Ser Gly Met Lys Ser 


Ser 


Ser Gly 




1220 




1225 


1230 






Leu 


Gly 


Ser Ser 


Gly Ser Leu 


Ser Gin Lys Thr Pro 


Pro 


Ser Ser 




1235 




1240 


1245 






Asn 


Ser 


Cys Thr 


Ala Ser Ser 


Ser Ser Phe Ser Ser 


Ser 


Gly Ser 




1250 


1255 


1260 






Ser 


Met 


Ser Ser 


Ser Gin Asn 


Gin His Gly Ser Ser 


Lys 


Gly Lys 




1265 




1270 


1275 






Sar 


Pro 


Ser Arg 


Asn Lys Lys 


Pro Ser Leu Thr Ala 


Val 


e p 




1280 




1285 


1290 






Lys 


Leu 


Lys His 


Gly Val Val 


Thr Ser Gly Pro Gly 


Gly 


Glu Asp 


1295 




1300 


1305 






Pro 


Leu 


Asp Gly 


Gin Met Gly 


Val Ser Thr Asn Ser 


Ser 


Ser is 




1310 




1315 


1320 






Pro 


Met 


Ser Ser 


Lys His Asn 


Met Ser Gly Gly Glu 


Phe 


Gin Gly 




1325 




1330 


1335 






Lys 


Arg 


Glu Lys 


Ser Asp Lys 


Asp Lys Ser Lys Val 


Ser 


Thr Ser 


1340 




1345 


1350 






Gly 


Ser 


Ser Val 


Asp Ser Ser 


Lys Lys Thr Ser Glu 


Ser 


Lys Asn 


1355 




1360 


1365 






Val 


Gly 


Ser Thr 


Gly Val Ala 


Lys He lie He Ser 


Lys 


His Asp 




1370 




1375 


1380 






Gly 


Gly 


Ser Pro 


Ser He Lys 


Ala Lys Val Thr Leu 


Gin 


Lys ro 


1385 




1390 


1395 






Gly 


Glu 


Ser Ser 


Gly Glu Gly 


Leu Arg Pro Gin Met 


Ala 


Ser Ser 


1400 




1405 


1410 






Lys 


Asn 


Tyr Gly 


Ser Pro Leu 


He Ser Gly Ser Thr 


Pro 


Lys His 


1415 


1420 


1425 






Glu 


Arg 


Gly Ser 


Pro Ser His 


Ser Lys Ser Pro Ala 


Tyr 


Thr Pro 




1430 




1435 


1440 






Gin 


Asn 


Leu Asp 


Ser Glu Ser 


Glu Ser Gly Ser Ser 


lie 


Ala G u 




1445 




1450 








Lys 


Ser 


Tyr Gin 


Asn Ser Pro 


Ser Ser Asp Asp Gly 


I e 


g ro 


1460 




1465 


1470 






Leu 


Pro 


Glu Tyr 


Ser Thr Glu 


Lys His Lys Lys His 


ys 


ys u 




1475 




1480 


1485 






Lys 


Lys 


Lys Val 


Lys Asp Lys 


Asp Arg Asp Arg Asp 


Arg 


Asp Lys 




1490 




1495 


1500 






Asp 


Arg 


Asp Lys 


Lys Lys Ser 


His Ser He Lys Pro 


U 


er rp 
















Ser 


Lys 5 


Ser Pro 


Xle Ser Ser 


Asp Gin Ser Leu 


Met 


Thr Ser 




1520 




1525 


1530 






Asn 


Thr 


He Leu 


Ser Ala Asp 


Arg Pro Ser Arg Leu 


Ser 


Pro Asp 




1535 




1540 


1545 






Phe 


Met 


He Gly 


Glu Glu Asp 


Asp Asp Leu Met Asp 


Val 


Ala Leu 




1550 




1555 


1560 






He 


Gly 


Asn 












1565 













<210> 31 

<211> 1490 

<212> PRT 

<213> Homo sapiens 






<400> 31 





Met 


Pro 


Asn 


Ser 


Glu 


Arg 




1 








5 




5 


Ala 


Sar 


Gly 


Thr 


Leu 


Gin 










20 








Arg 


Glu 


Arg 


His 


Arg 


Leu 








35 










His 


Ser 


Lys 


Asp 


Met 


Gly 


10 




50 












Thr 


Val 


He 


Lys 


Pro 


Leu 




65 










70 




Asp 


Thr 


Phe 


Ser 


Asp 


Asp 










85 






Asp 


Glu 


Arg 


Arg 


Gly 


Ser 


15 








100 








His 


His 


Gin 


His 


Arg 


Arg 








115 










Glu 


Lys 


Glu 


Lys 


Ser 


Gin 






130 










20 


Asp 


Arg 


He 


Ser 


Gly 


Ser 




145 










150 




Tyr 


Gly 


Lys 


Ala 


Gin 


Val 












165 






Sar 


Lys 


Leu 


His 


Lys 


Glu 


25 








180 






Gly 


His 


Lys 


Asp 


Arg 


Ser 








195 










Sar 


Tyr 


Lys 


Thr 


Val 


Asp 






210 












Arg 


Lys 


Trp 


Ser 


Asp 


Ser 


30 


225 










230 




Sar 


Tyr 


Gly 


Gin 


Asp 


Tyr 












245 






Sar 


Asn 


Tyr 


Asp 


Ser 


Tyr 










260 








Gin 


Ser 


Val 


Ser 


Pro 


Pro 


35 






275 










Thr 


Arg 




Pro 


Ser 


Pro 






290 












Tyr 


Ser 


Arg 


Arg 


Arg 


Ser 




305 










310 


40 


Gly 


Arg 


Ser 


Pro 


Ser 


Pro 










325 






Lau 


Ser 


Lys 


Arg 


Ser 


Leu 










340 








Mat 


Lys 


Ser 
355 


Arg 


Ser 


Arg 


45 


His 


Ser 


Lys 


Lys 


Lys 


Arg 






370 












Sar 


Pro 


Val 


Arg 


Leu 


Pro 




385 










390 




Arg 


Lys 


Lys 




Glu 


Arg 


50 










405 






Gly 


Lys 


Glu 


Ser 


Lys 


Gly 










420 








Ser 


Ser 


Val 


Glu 


Ala 


Lys 








435 








55 


Arg 


Ser 


Val 


Lys 


Leu 


Glu 




450 











His 


Gly 


Gly 


Lys 


Lys 


Asp 


Gly 


Ser 


Gly 


Gly 








10 










15 




Pro 


Ser 


Ser 


Gly 


Gly 


Gly 


Ser 


Ser 


Asn 


Ser 






25 










30 






Val 


Ser 


Lys 


His 


Lys 


Arg 


His 


Lys 


Ser 


Lys 




40 










45 








Leu 


Val 


Thr 


Pro 


Glu 


Ala 


Ala 


Ser 


Leu 


Gly 


55 










60 










Val 


Glu 


Tyr 


Asp 


Asp 


He 


Ser 


Ser 


Asp 


Ser 










75 










80 


Met 


Ala 


Phe 


Lys 


Leu 


Asp 


Arg 


Arg 


Glu 


Asn 








90 










95 




Asp 


Arg 


Ser 


Asp 


Arg 


Leu 


His 


Lys 


His 


Arg 






105 










110 






Ser 


Arg 


Asp 


Leu 


Leu 


Lys 


Ala 


Lys 


Gin 


Thr 




120 










125 








Glu 


Val 


Ser 


Ser 


Lys 


Ser 


Gly 


Ser 


Met 


Lys 


135 










140 










Ser 


Lys 


Arg 


Ser 


Asn 


Glu 


Glu 


Thr 


Asp 


Asp 








155 










160 


Ala 


Lys 


Ser 


Ser 


Ser 


Lys 


Glu 


Ser 


Arg 


Ser 






170 










175 




Lys 


Thr 


Arg 


Lys 


Glu 


Arg 


Glu 


Leu 


Lys 


Ser 




185 










190 






Lys 


Ser 


His 


Arg 


Lys 


Arg 


Glu 


Thr 


Pro 


Lys 




200 










205 








Ser 


Pro 


Lys 


Arg 


Arg 


Ser 


Arg 


Ser 


Pro 


His 


215 










220 










Ser 


Lys 


Gin 


Asp 


Asp 


Ser 


Pro 


Ser 


Gly 


Ala 








235 










240 


Asp 


Leu 


Ser 


Pro 


Ser 


Arg 


Ser 


His 


Thr 


Ser 






250 










255 




Lys 


Lys 


Ser 


Pro 


Gly 


Ser 


Thr 


Ser 


Arg 


Arg 






265 










270 






Tyr 


Lys 


Glu 


Pro 


Ser 


Ala 


Tyr 


Gin 


Sar 


Ser 




280 










285 








Tyr 


Ser 


Arg 


Arg 


Gin 


Arg 


Ser 


Val 


Ser 


Pro 


2 95 










300 










Ser 


Ser 


Tyr 


Glu 


Arg 


Ser 


Gly 


Ser 


Tyr 


Ser 










315 










320 


Tyr 


Gly 


Arg 


Arg 


Arg 


Ser 


Ser 


Ser 


Pro 


Phe 








330 










335 




Ser 


Arg 


Ser 


Pro 


Leu 


Pro 


Ser 


Arg 


Lys 


Ser 






345 










350 






Ser 


Pro 


Ala 


Tyr 


Ser 


Arg 


His 


Ser 


Ser 


Ser 




360 










365 








Ser 


Ser 


Ser 


Arg 


Ser 


Arg 


His 


Ser 


Ser 


lie 


375 










380 










Leu 


Asn 


Ser 


Ser 


Leu 


Gly 


Ala 


Glu 


Leu 


Ser 






















Ala 


Ala 


Ala 


Ala 


Ala 


Ala 


Ala 


Lys 


Met 


Asp 








410 










415 




Ser 


Pro 


Val 


Phe 


Leu 


Pro 


Arg 




Glu 


Asn 






425 










430 






Asp 


Ser 


Gly 


Leu 


Glu 


Ser 


Lys 




Leu 


Pro 


440 










445 










Ser 


Ala 


Pro 


Asp 


Thr 


Glu 


Leu 


Val 


Asn 


455 










460 














Val 


Thr 


His 


Leu 


Asn 


Thr 


Glu 


Val 


Lys 


Asn Ser Ser Asp 


Thr 


Gly 




465 










470 








475 








Val 


Lys 


Leu 


Asp 


Glu 


Asn 


Ser 


Glu 


Lys 


His Leu Val Lys 


Asp 


Leu 


ys 






485 










490 




495 




Ala 


Gin 


Gly 


Thr 


Arg 


Asp 


Ser 


Lys 


Pro 


He Ala Leu Lys 


Glu 


Glu 


e 






500 










505 




510 






Val 


Thr 


Pro 


Lys 


Glu 


Thr 


Glu 


Thr 


Ser 


Glu Lys Glu Thr 


Pro 


Pro 


Pro 






515 








520 




525 








Leu 


Pro 


Thr 


He 


Ala 


Ser 


Pro 


Pro 


Pro 


Pro Leu Pro Thr 


Thr 


Thr 


Pro 




530 










535 






540 








Pro 


Pro 


Gin 


Thr 


Pro 


Pro 


Leu 


Pro 


Pro 


Leu Pro Pro He 


Pro 


a 


*"? 


545 










550 








555 






560 


Pro 


Gin 


Gin 


Pro 


Pro 


Leu 


Pro 


Pro 


Ser 


Gin Pro Ala Phe 


Ser 


Gin 


Val 










565 










570 




575 




Pro 


Ala 


Ser 


Ser 


Thr 


Ser 


Thr 


Leu 


Pro 


Pro Ser Thr His 


Ser 


ys 


r 








580 










585 




590 






Sar 


Ala 


Val 


Ser 


Ser 


Gin 


Ala 


Asn 


Ser 


Gin Pro Pro Val 


Gin 


Val 


Sar 






595 










600 




605 








Val 


Lys 


Thr 


Gin 


Val 


Ser 


Val 


Thr 


Ala 


Ala He Pro His 


Leu 


Ly3 


Thr 




610 










615 






620 








Ser 


Thr 


Leu 


Pro 


Pro 


Leu 


Pro 


Leu 


Pro 


Pro Leu Leu Pro 


Gly 


Gly 


Asp 


625 










630 








635 






640 


Asp 


Met 


Asp 


Ser 


Pro 


Lys 


Glu 


Thr 


Leu 


Pro Ser Lys Pro 


Val 




ys 








645 










650 




655 




Glu 


Lys 


Glu 


Gin 


Arg 


Thr 


Arg 


His 


Leu 


Leu Thr Asp Leu 




Leu 


Pro 






660 










665 




670 






Pro 


Glu 


Leu 


Pro 


Gly 


Gly 


Asp 


Leu 


Ser 


Pro Pro Asp Ser 


Pro 


Glu 


Pro 






675 










680 




685 








Lys 


Ala 


He 


Thr 


Pro 


Pro 


Gin 


Gin 


Pro 


Tyr Lys Lys Arg 


Pro 


Lys 


He 


690 










695 






700 








Cys 


Cys 


Pro 


Arg 


Tyr 


Gly 


Glu 


Arg 


Arg 


Gin Thr Glu Ser 


Asp 


Trp 


Gly 


705 








710 








7 ^ 






Ci° 


Lys 


Arg 


Cys 


Val 


Asp 


Lys 


Phe 


Asp 


He 


He Gly He He 


G y 


u 


y 








725 










730 








Thr 


Tyr 


Gly 


Gin 


Val 


Tyr 


Lys 


Ala 


Arg 


Asp Lys Asp Thr 




Glu 


Leu 




740 














7S0 






Val 


Ala 


Leu 


Lys 


Lys 


Val 


Arg 


Leu 


Asp 


Asn Glu Lys Glu 


Gly 


Phe 


Pro 






755 






760 




765 








He 


Thr 


Ala 


He 


Arg 


Glu 


He 


Lys 


He 


Leu Arg Gin Leu 


He 


His 


Arg 




770 










775 














Ser 


Val 


Val 


Asn 


Met 


Lys 


Glu 


He 


Val 


Thr Asp Lys Gin 


Asp 






785 










790 








J 95 ,r 1 tvu 






800 


Asp 


Phe 


Lys 


Lys 


Asp 


Lys 


Gly 


Ala 


Phe 


Tyr Leu Val Phe 


u 




Met 








805 














HI R 




Asp 


His 


Asp 


Leu 


Met 


Gly 


Leu 


Leu 


Glu 


Ser Gly Leu Val 


. 




Ser 






820 










825 




830 






Glu 


Asp 


His 


He 


Lys 


Ser 


Phe 


Met 


Lys 


Gin Leu Met Glu 


Gly 


Leu 


Glu 






835 










840 




845 








Tyr 


Cys 


His 


Lys 


Lys 


Asn 


Phe 


Leu 


His 


Arg Asp He Lys 


Cys 


Ser 


n 


850 










855 














He 


Leu 




Asn 


Asn 


Sar 


Gly 


Gin 


He 


Lys Leu Ala Asp 


Phe 


Gly 


Leu 


865 










870 








875 






880 
Val 


Ala 


Arg 




Tyr 


Asn 


Ser 


Glu 


Glu 


Ser 


Arg Pro Tyr Thr 


Asn 


Lys 






885 










890 




895 




He 


Thr 


Leu 


Trp 


Tyr 


Arg 


Pro 


Pro 


Glu 


Leu Leu Leu Gly 


Glu 


Glu 


Arg 








900 










905 




910 








Thr 


Pro 


Ala 


He 




Val 


Trp 


Ser 


Cys Gly Cys He 


Leu 


Gly 


Glu 




915 










920 




925 



























Leu Phe Thr Lys Lys Pro Ile Phe Gln Ala Asn Leu Glu Leu Ala Gln 

930 935 940 

Leu Glu Leu Ile Ser Arg Leu Cys Gly Ser Pro Cys Pro Ala Val Trp 
945 950 955 960 

Pro Asp Val Ile Lys Leu Pro Tyr Phe Asn Thr Met Lys Pro Lys Lys 

965 970 975 

Gln Tyr Arg Arg Arg Leu Arg Glu Glu Phe Ser Phe Ile Pro Ser Ala 

980 985 990 

Ala Leu Asp Leu Leu Asp His Met Leu Thr Leu Asp Pro Ser Lys Arg 

995 1000 1005 

Cys Thr Ala Glu Gln Thr Leu Gln Ser Asp Phe Leu Lys Asp Val 

1010 1015 1020 

Glu Leu Ser Lys Met Ala Pro Pro Asp Leu Pro His Trp Gln Asp 

1025 1030 1035 

Cys His Glu Leu Trp Ser Lys Lys Arg Arg Arg Gln Arg Gln Ser 

1040 1045 1050 

Gly Val Val Val Glu Glu Pro Pro Pro Ser Lys Thr Ser Arg Lys 

1055 1060 1065 

Glu Thr Thr Ser Gly Thr Ser Thr Glu Pro Val Lys Asn Ser Ser 

1070 1075 1080 

Pro Ala Pro Pro Gln Pro Ala Pro Gly Lys Val Glu Ser Gly Ala 

1085 1090 1095 

Gly Asp Ala Ile Gly Leu Ala Asp Ile Thr Gln Gln Leu Asn Gln 

1100 1105 1110 

Ser Glu Leu Ala Val Leu Leu Asn Leu Leu Gln Ser Gln Thr Asp 

1115 1120 1125 

Leu Ser Ile Pro Gln Met Ala Gln Leu Leu Asn Ile His Ser Asn 

1130 1135 1140 

Pro Glu Met Gln Gln Gln Leu Glu Ala Leu Asn Gln Ser Ile Ser 

1145 1150 1155 

Ala Leu Thr Glu Ala Thr Ser Gln Gln Gln Asp Ser Glu Thr Met 

1160 1165 1170 

Ala Pro Glu Glu Ser Leu Lys Glu Ala Pro Ser Ala Pro Val Ile 

1175 1180 1185 

Leu Pro Ser Ala Glu Gln Met Thr Leu Glu Ala Ser Ser Thr Pro 

1190 1195 1200 

Ala Asp Met Gln Asn Ile Leu Ala Val Leu Leu Ser Gln Leu Met 

1205 1210 1215 

Lys Thr Gln Glu Pro Ala Gly Ser Leu Glu Glu Asn Asn Ser Asp 

1220 1225 1230 

Lys Asn Ser Gly Pro Gln Gly Pro Arg Arg Thr Pro Thr Met Pro 

1235 1240 1245 

Gln Glu Glu Ala Ala Ala Cys Pro Pro His Ile Leu Pro Pro Glu 

1250 1255 1260 

Lys Arg Pro Pro Glu Pro Pro Gly Pro Pro Pro Pro Pro Pro Pro 

1265 1270 1275 

Pro Pro Leu Val Glu Gly Asp Leu Ser Ser Ala Pro Gln Glu Leu 

1280 1285 1290 

Asn Pro Ala Val Thr Ala Ala Leu Leu Gln Leu Leu Ser Gln Pro 

1295 1300 1305 

Glu Ala Glu Pro Pro Gly His Leu Pro His Glu His Gln Ala Leu 

1310 1315 1320 

Arg Pro Met Glu Tyr Ser Thr Arg Pro Arg Pro Asn Arg Thr Tyr 

1325 1330 1335 

Gly Asn Thr Asp Gly Pro Glu Thr Gly Phe Ser Ala Ile Asp Thr 

1340 1345 1350 

Asp Glu Arg Asn Ser Gly Pro Ala Leu Thr Glu Ser Leu Val Gln 

1355 1360 1365 








Thr 


Leu 
1370 


Val Lys Asn Arg Thr 
1375 


Phe 


er 


Gly 


^ 1380 


er 


His 






Gly 


Glu 


Ser Ser Ser Tyr Gin 


Gly 


r 


y 


VI 
^ 1395 


Gin 


Phe 


Pro 


5 


1385 


1390 


















Gly 


Asp 


Gin Asp Leu Arg; Phe 


a 


Arg 


V 


P Le 
r ° 1410 


Ala 


Leu 


His 




1400 






















Val 


Val Gly Gin Pro Phe 


u 


yS 


Ala 


Glu Gly 
1425 


Ser 


Ser 


Asn 


10 




Val 5 


Val His A3 a Glu Thr 


Lys 


Leu 


Gin 




Gly 


Glu 


Leu 




1430 


1435 








1440 














Gly Thr Thr Gly Ala 


Ser 


Ser 


Ser 


Gly Ala 


Gly 


Leu 


His 




1445 


1450 








1455 










Trp 


Gly 


Gly Pro Thr Gin Ser 


Ser 


Ala 


Tyr 


Gly Lys 


Leu 


Tyr 


Arg 




1460 


1465 








1470 








15 


Gly 


Pro 


Thr Arg Val Pro Pro 


Arg 


Gly 


Gly 


Arg Gly 


Arg 


Gly 


Val 




1475 


1480 








1485 










Pro 


Tyr 
1490 



















<210> 32 
<211> 381 
<212> PRT 



<400> 32 






















Met 


Leu 


Thr 


Arg 


Leu Phe Ser 


Glu Pro 


Gly 


Leu 


Leu 


Ser 


Asp 


Val 


Pro 


1 






5 




10 










15 




Lys 


Phe 


Ala 


Ser 


Trp Gly Asp 


Gly Glu 


Asp 


Asp 


Glu 


Pro 


Arg 


Ser 


Asp 






20 


25 










30 






Lys 


Gly 


Asp 


Ala 


Pro Pro Pro 


Pro Pro 


Pro 


Ala 


Pro 


Gly 


Pro 


Gly 


Ala 


35 






40 








45 








Pro 


Gly 
50 


Pro 


Ala 


Arg Ala Ala 
55 


Lys Pro 


Val 


Pro 


Leu 
60 


Arg 


Gly 


Glu 


Glu 


Gly 


Thr 


Glu 


Ala 


Thr Leu Ala 


Glu Val 




Glu 


Glu 


Gly 


Glu 


Leu 


Gly 


65 








70 






75 










80 


Gly 


Glu 


Glu 


Glu 


Glu Glu Glu 


Glu Glu 


Glu 


Glu 


Gly 


Leu 


Asp 


Glu 


Ala 








85 




90 










95 




Glu 


Gly 


Glu 


Arg 


Pro Lys Lys 


Arg Gly 


Pro 


Lys 


Lys 


Arg 


Lys 


Met 


Thr 






100 




105 










110 






Lys 


Ala 


Arg 


Leu 


Glu Arg Ser 


Lys Leu 


Arg 


Arg 


Gin 


Lys 


Ala 


Asn 


Ala 




115 






120 








125 








Arg 


Glu 


Arg 


Asn 


Arg Met His 


Asp Leu 


Asn 


Ala 


Ala 


Leu 


Asp 


Asn 


Leu 


130 




135 








140 










Arg 




Val 


Val 


Pro Cys Tyr 


Ser Lys 


Thr 


Gin 


Lys 


Leu 


Ser 


Lys 


He 


145 






150 






155 










160 


Glu 


Thr 


Leu 


Arg 


Leu Ala Lys 


Asn Tyr 


He 


Trp 


Ala 


Leu 


Ser 


Glu 


He 








165 




170 










175 




Leu 


Arg 


Ser 


Gly 


Lys Arg Pro 


Asp Leu 


Val 


Ser 


Tyr 


Val 


Gin 


Thr 


Leu 






180 




185 










190 






Cys 




Gly 


Leu 


Ser Gin Pro 


Thr Thr 


Asn 




Val 


Ala 


Gly 


Cys 


Leu 


195 






200 








205 








Gin 


210 


Asn 


Ser 


Arg Asn Phe 
215 




Glu 


Gin 


Gly 
220 


Ala 


Asp 


Gly 


Ala 






Gly 


Arg 


Phe 


His Gly Ser Gly 


y 


Pro Phe Ala Met 


His Pro 


Tvr 


Pro 


225 














240 


Tyr 


Pro 


Cys 


Ser Arg Leu Ala 


y 


Al Gl Cvs Gin 
* 250 n 


Ala Ala 


Gly 


Gly 




245 








255 




Leu 


Gly 


Gly 


Gly Ala Ala His 


Ala 


Leu g r is 


Gl Tvr 
Y 270 


Cys 


Ala 




260 




f 








Ala 


Tyr 


Glu 


Thr Leu Tyr Ala 




Ala Gly Gly Gly 


285 * 


r 
er 


Pro 




275 




580 










Asp 


Tyr 


Asn 


Ser Ser Glu Tyr 


Glu 


Gly Pro Leu Ser 


Pro Pro 


Leu 


Cys 


290 




295 




300 








Leu 


Asn 


Gly 


Asn Phe Ser Leu 


Lys 


Gin Asp Ser Ser 


Pro Asp 






305 














320 




Ser 


Tyr 


His Tyr Ser Met 


His 


Tyr Ser Ala Leu 


Pro Gly 


Ser 


Arg 




325 




330 




335 




His 


Gly 


His 


Gly Leu Val Phe 


Gly 


Ser Ser Ala Val 


Arg Gly 


Gly 


Val 






340 




345 


350 






His 


Ser 


Glu 
355 


Asn Leu Leu Ser 


Tyr 
360 


Asp Met His Leu 


His His 
365 


Asp 


Arg 


Gly 


Pro 


Met 


Tyr Glu Glu Leu 


Asn 


Ala Phe Phe His 


Asn 






370 




375 




380 









<210> 33 
<211> 445 



<212> PRT 



<400> 33 














Met 


Ser 


Lys 


Leu Pro Arg Glu 


Leu 


Thr Arg Asp Leu 


Glu Arg 


Ser 


Leu 


1 




5 




10 




15 




Pro 


Ala 


Val 


Ala Ser Leu Gly 


Ser 


Ser Leu Ser His 


Ser Gin 


Ser 


Leu 








20 




25 


30 






Ser 


Ser 


His 


Leu Leu Pro Pro 


Pro 


Glu Lys Arg Arg 


Ala He 


Ser 


Asp 






35 




40 




45 






Val 


Arg 


Arg 


Thr Phe Cys Leu 


Phe 


Val Thr Phe Asp 


Leu Leu 


Phe 


He 




50 


55 




60 








Ser 


Leu 


Leu 


Trp He He Glu 


Leu 


Asn Thr Asn Thr 


Gly He 


Arg 


Lys 


65 






70 




75 






80 


Asn 


Leu 


Glu 


Gin Glu He He 


Gin 


Tyr Asn Phe Lys 




Phe 


Phe 








85 




90 




95 




Asp 


He 


Phe 


Val Leu Ala Phe 


Phe 


Arg Phe Ser Gly 


Leu Leu 


Leu 


Gly 






100 




105 


110 








Ala 


Val 


Leu Gin Leu Arg 


His 


Trp Trp Val He 


Ala Val 


Thr 


Thr 




115 




120 




125 






Leu 


Val 


Ser 


Ser Ala Phe Leu 


He 


Val Lys Val He 


Leu Ser 


Glu 


Leu 




130 




135 




140 








Leu 


Ser 


Lys 


Gly Ala Phe Gly 


Tyr 


Leu Leu Pro He 


Val Ser 


Phe 


Val 


145 




150 




155 






160 


Leu 


Ala 


Trp 


Leu Glu Thr Trp 


Phe 


Leu Asp Phe Lys 


Val Leu 


Pro 


Gin 






165 




170 




175 




Glu 


Ala 


Glu 


Glu Glu Arg Trp 


Tyr 


Leu Ala Ala Gin 


Val Ala 


Val 


Ala 








180 




185 


190 






Arg 


Gly 


Pro 


Leu Leu Phe Ser 


Gly 


Ala Leu Ser Glu 


Gly Gin 


Phe 


Tyr 


195 




200 




205 












Ser 


Pro 


Pro 


Glu 


Ser 


Phe 


Ala Gly 


Ser 


Asp 


Asn Glu 


Ser 


Asp 


Glu 






210 










215 






220 










Val 


Ala 


Gly 


Lys 


Lys 


Ser 


Phe Ser 


Ala 


Gin 


ft* 


Glu 


Tyr 


I e 


5 


225 








230 


















Gin 


Gly 


Lys 


Glu 


Ala 


Thr 


Ala Val 


Val 


Asp 


i 

Gin He 


Leu 


a 










245 








250 








255 




Glu 


Asn 


Trp 


Lys 


Phe 


Glu 


Lys Asn 


Asn 


Glu 


Tyr Gly 


Asp 




V 








260 








265 








970 




10 


Thr 


lie 


Glu 


Val 


Pro 


Phe 


His Gly 


Lys 


Thr 


Phe He 




Lys 


T 






275 








280 








285 








Leu 


Pro 


Cys 


Pro 


Ala 


Glu 


Leu Val 


Tyr 


Gin 




He 


Leu 


Gin 






290 








295 
















Glu 


Arg 


Met 


Val 


Leu 


Trp 


Asn Lys 


Thr 


V 1 


Th Ala 


Cys 


Gin 


He 




305 








310 








315 








15 


Gin 


Arg 


Val 


Glu 


Asp 


Asn 


Thr Leu 


He 


Ser 


Tyr Asp 


Val 


Ser 












325 








330 












Ala 


Ala 


Gly 


Gly 


Val 


Val 


Ser Pro 


Arg 


Asp 


Phe Val 


Asn 


* 


Arg 








340 








345 








35 






lie 


Glu 


Arg 


Arg 


Arg 


Asp 


Arg Tyr 


Leu 


Ser 


Ser Gly 




Ala 


Thr 


20 






355 








360 








365 






His 


Ser 
370 


Ala 


Lys 


Pro 


Pro 


Thr His 
375 


Lys 


Tyr 


V 1 Arg 
380 


Gly 


Glu 


Asn 




























Val 




385 






390 








395 










Thr 


Phe 


Val 


Trp 


He 


Leu 


Asn Thr 


Asp 


Leu 


Lys Gly 


Arg 


Leu 


Pro 


25 








405 








410 








415 




Tyr 


Leu 


lie 


His 


Gin 


Ser 


Leu Ala 


Ala 


Thr 


Met Phe 


Glu 


Phe 


Ala 








420 








425 








430 






His 


Leu 


Arg 
435 


Gin 


Arg 


He 


Ser Glu 
440 


Leu 


Gly 


Ala Arg 


Ala 
445 







<210> 34 

<211> 167 

<212> PRT 


<213> Homo sapiens 

<400> 34 



Met Ala Thr Ser Glu Leu Ser Cys Glu Val Ser Glu Glu Asn Cys Glu 
1 5 10 15 

Arg Arg Glu Ala Phe Trp Ala Glu Trp Lys Asp Leu Thr Leu Ser Thr 
20 25 30 

Arg Pro Glu Glu Gly Cys Ser Leu His Glu Glu Asp Thr Gln Arg His 
35 40 45 

Glu Thr Tyr His Gln Gln Gly Gln Cys Gln Val Leu Val Gln Arg Ser 
50 55 60 

Pro Trp Leu Met Met Arg Met Gly Ile Leu Gly Arg Gly Leu Gln Glu 
65 70 75 80 

Tyr Gln Leu Pro Tyr Gln Arg Val Leu Pro Leu Pro Ile Phe Thr Pro 
85 90 95 

Ala Lys Met Gly Ala Thr Lys Glu Glu Arg Glu Asp Thr Pro Ile Gln 
100 105 110 

Leu Gln Glu Leu Leu Ala Leu Glu Thr Ala Leu Gly Gly Gln Cys Val 
115 120 125 






EP 1 365 034 A2 



Asp Arg Gln Glu Val Ala Glu Ile Thr Lys Gln Leu Pro Pro Val Val
    130                 135                 140

Pro Val Ser Lys Pro Gly Ala Leu Arg Arg Ser Leu Ser Arg Ser Met
145                 150                 155                 160

Ser Gln Glu Ala Gln Arg Gly
                165

<210> 35

<211> 282

<212> PRT

<213> Homo sapiens



<400> 35 


















Met Ser Gly Ala Asp Arg Ser Pro Asn Ala Gly Ala Ala Pro Asp Ser
1               5                   10                  15

Ala Pro Gly Gln Ala Ala Val Ala Ser Ala Tyr Gln Arg Phe Glu Pro
            20                  25                  30

Arg Ala Tyr Leu Arg Asn Asn Tyr Ala Pro Pro Arg Gly Asp Leu Cys
        35                  40                  45

Asn Pro Asn Gly Val Gly Pro Trp Lys Leu Arg Cys Leu Ala Gln Thr
    50                  55                  60

Phe Ala Thr Gly Glu Val Ser Gly Arg Thr Leu Ile Asp Ile Gly Ser
65                  70                  75                  80

Gly Pro Thr Val Tyr Gln Leu Leu Ser Ala Cys Ser His Phe Glu Asp
                85                  90                  95

Ile Thr Met Thr Asp Phe Leu Glu Val Asn Arg Gln Glu Leu Gly Arg
            100                 105                 110

Trp Leu Gln Glu Glu Pro Gly Ala Phe Asn Trp Ser Met Tyr Ser Gln
        115                 120                 125








His 


Ala 


Cys 


Leu 


Ile Glu Gly Lys Gly Glu


Cys 


Trp 


Gln


Asp 




Glu 




130 




135 




140 










Arg 


Gln




Arg 


Ala Arg Val Lys Arg Val 


Leu 


Pro 


Ile


Asp 


Val 


His 


145 






150 


155 










160 


Gln Pro Gln Pro Leu Gly Ala Gly Ser Pro Ala Pro Leu Pro Ala Asp
                165                 170                 175

Ala Leu Val Ser Ala Phe Cys Leu Glu Ala Val Ser Pro Asp Leu Ala
            180                 185                 190

Ser Phe Gln Arg Ala Leu Asp His Ile Thr Thr Leu Leu Arg Pro Gly
        195                 200                 205

Gly His Leu Leu Leu Ile Gly Ala Leu Glu Glu Ser Trp Tyr Leu Ala
    210                 215                 220

Gly Glu Ala Arg Leu Thr Val Val Pro Val Ser Glu Glu Glu Val Arg
225                 230                 235                 240

Glu Ala Leu Val Arg Ser Gly Tyr Lys Val Arg Asp Leu Arg Thr Tyr
                245                 250                 255

Ile Met Pro Ala His Leu Gln Thr Gly Val Asp Asp Val Lys Gly Val
            260                 265                 270

Phe Phe Ala Trp Ala Gln Lys Val Gly Leu
        275                 280


















<210> 36 

<211> 1255 

<212> PRT 

<213> Homo sapiens 



<400> 36 




















Met Glu Leu Ala Ala Leu Cys Arg Trp Gly Leu Leu Leu Ala Leu Leu
1               5                   10                  15

Pro Pro Gly Ala Ala Ser Thr Gln Val Cys Thr Gly Thr Asp Met Lys
            20                  25                  30

Leu Arg Leu Pro Ala Ser Pro Glu Thr His Leu Asp Met Leu Arg His
        35                  40                  45

Leu Tyr Gln Gly Cys Gln Val Val Gln Gly Asn Leu Glu Leu Thr Tyr
    50                  55                  60










Leu 


Pro 


Thr 


Asn 


Ala Ser Leu 


Ser 


Phe Leu Gln


Asp


Ile


Gln


Glu 


an 1 


65 








70 




75 












Gln Gly Tyr Val Leu Ile Ala His Asn Gln Val Arg Gln Val Pro Leu
                85                  90                  95

Gln Arg Leu Arg Ile Val Arg Gly Thr Gln Leu Phe Glu Asp Asn Tyr
            100                 105                 110






Ala 


Leu 


Ala 


Val 


Leu Asp Asn 


Gly Asp Pro Leu 


Asn 


Asn 


Thr 










115 






120 






125 








Val 


Thr 


Gly 


Ala 


Ser Pro Gly 


Gly Leu Arg Glu 


Leu 






Arg 






130 




135 






140 










Leu 


Thr 


Glu 


Ile


Leu Lys Gly


Gly Val Leu Ile


Gln


Arg 


Asn 


Pro 




145 








150 




155 










160 


Leu 


Cys 


Tyr 


Gln


Asp Thr Ile


Leu Trp Lys Asp


Ile


Phe 






n 






165 




170 








175 




Asn 


Gln


Leu


Ala


Leu Thr Leu


Ile Asp Thr Asn


Arg 


S 


Arg 


Ala 


Cys 








180 






185 






190 






His 


Pro 


Cys 
195 


Ser 


Pro Met Cys 


Lys 
200 


Gly Ser Arg 


Cys 


Trp 


Gly 


Glu 


Ser 


Ser 


Glu 


Asp 


Cys 


Gin Ser Leu 


Thr Arg Thr Val 


Cys 


Ala 


Gly 


Gly 


Cys 




210 


215 






220 










Ala 


Arg 


Cys 


Lys 


Gly Pro Leu 


Pro 


Thr Asp Cys 


Cys 


His 


Glu 


Gin 


Cys 




230 




235 












Ala 


Ala 


Gly 






Lys 


His Ser Asp 


Cys 


Leu 


Ala 


Cys 


Leu 






245 




250 








255 




His 


Phe 


Asn 


His 
260 


Ser Gly He 


Cys 


Glu Leu His 

265 


Cys 


Pro 


Ala 
270 


Leu 


Val 


Thr 


Tyr 


Asn 
275 


Thr 


Asp Thr Phe 


Glu 
280 


Ser Met Pro 


Asn 


Pro 
285 


Glu 


Gly 


Arg 


Tyr 


Thr 


Phe 


Gly 


Ala Ser Cys 


Val 


Thr Ala Cys 


Pro 


Tyr 


Asn 


Tyr 


Leu 


290 




295 






300 










Ser 


Thr 


Asp 


Val 


Gly Ser Cys 


Thr 


Leu Val Cys 


Pro 


Leu 


His 


Asn 


Gin 


305 






310 




315 










320 


Glu 


Val 


Thr 


Ala 


Glu Asp Gly 
325 


Thr 


Gin Arg Cys 
330 


Glu 


Lys 


Cys 


Ser 
335 


Lys 


Pro 


Cys 


Ala 


Arg 


Val Cys Tyr 


Gly 


Leu Gly Met 


Glu 


His 


Leu 


Arg 


Glu 






340 






345 






350 






Val 


Arg 


Ala 


Val 


Thr Ser Ala 


Asn 


He Gin Glu 


Phe 


Ala 


Gly 


Cys 


Lys 




355 






360 






365 








Lys 


Ile


Phe 


Gly 


Ser Leu Ala 


Phe 


Leu Pro Glu 


Ser 


Phe 


Asp 


Gly 


Asp 


370 




375 






380 














Pro Ala 


Ser 


Asn Thr Ala Pro 


Leu 


Gin 


Pro 


■aoc G n 


u n 


a 


Phe 


385 




390 








395 








Glu Thr 


Leu 


Glu Glu He Thr 


Gly 


Tyr 


Leu 


Tyr I e 


Al 
er a 


f*? 


Pro 






405 






410 










Asp Ser 


Leu 


Pro Asp Leu Ser 


Val 


Phe 


Gin 


Asn Leu 


Gin Val 




Arg 




420 




425 






430 






Gly Arg 


He 


Leu His Asn Gly 


Ala 


Tyr 


Ser 


Leu Thr 


^5 


Gly 


U 


435 




440 














Gly He 


Ser 


Trp Leu Gly Leu 


Arg 


Ser 


Leu 


Arg Glu 


Leu Gly 


Ser 


Gly 


450 




455 








460 








Leu Ala 


Leu 


He His His Asn 


Thr 


His 


Leu 


Cys Phe 


Val is 


E 


V 


465 




470 














480 


Pro Trp 


Asp 


Gin Leu Phe Arg 


Asn 


Pro 


His 


Gin Ala 


Leu Leu 




r 


485 






490 






495 




Ala Asn 


Arg 


Pro Glu Asp Glu 


Cys 


Val 


Gly 


Glu Gly 


T 

Leu Ala 


Cys 


is 




500 




505 






510 






Gin Leu 


Cys 


Ala Arg Gly His 


Cys 


Trp 


Gly 


Pro Gly 


Pro Thr 


Gin 


Cys 




515 




520 








? 






Val Asn 


Cys 


Ser Gin Phe Leu 


Arg 


Gly 


Gin 


Glu Cys 


Val Glu 


Glu 


Cys 


530 


535 








540 








Arg Val 


Leu 


Gin Gly Leu Pro 


Arg 


Glu 


Tyr 


555 


Ala Arg 


His 


Cys 


545 




550 














rv° 


Leu Pro 


Cys 


His Pro Glu Cys 


Gin 


Pro 


Gin 


Asn Gly 


VI 
er a 




■ys 




565 






570 










Phe Gly 


Pro 


Glu Ala Asp Gin 


Cys 


Val 


Ala 


Cys Ala 


18 590 


ys 


As 

P 




580 




585 












Pro Pro 


Phe 


Cys Val Ala Arg 


Cys 


Pro 


Ser 


Gly Val 


Lys Pro 


Asp 


U 




595 


600 








605 






Ser Tyr 


Met 


Pro He Trp Lys 


Phe 


Pro 


Asp 


Glu Glu 


Gly Ala 


Cys 




610 




615 








620 








Pro Cys 


Pro 


He Asn Cys Thr 


His 


Ser 


Cys 


Val Asp 


T 

Leu Asp 


Asp 




625 




630 








635 








Gly Cys 


Pro 


Ala Glu Gin Arg 


Ala 


Ser 


Pro 


Leu Thr 


Ser I e 


a 


er 




645 






650 










Ala Val 


Val 


Gly He Leu Leu 


Val 




Val 


Leu Gly 


V 670 




Y 






660 




665 












He Leu 


He 


Lys Arg Arg Gin 


Gin 


Lys 


lis 


Arg Lys 






9 




675 


680 








685 






Arg Leu 
690 


Leu 


Gin Glu Thr Glu 


Leu 


Val 


Glu 


° 7nn 


r ro 


Ser 
er 


Glv 




695 
















Ala Met 


Pro 


Asn Gin Ala Gin 


Met 


Arg 


He 


Leu Lys 


u r 


Gl 

U 


Leu 


705 




710 














720 


Arg Lys 


Val 


Lys Val Leu Gly 


Ser 


Gly 


Ala 


Phe Gly 


Thr Val 




ys 




725 






730 










Gly He 


Trp 


He Pro Asp Gly 


Glu 


Asn 


Val 


Lys He 


Pr ° ^ a 


3 


He 


740 
















Lys Val 


Leu 


Arg Glu Asn Thr 


Ser 


Pro 


Lys 


Ala Asn 


Lys Glu 


I e 


Leu 


755 




760 








V 






Asp Glu 


Ala 


Tyr Val Met Ala 


Gly 


Val 


Gly 




Tyr a 


er 


Arg 


770 












780 










Gl 


le Cys Leu Thr 


Ser 


Thr 


Val 


Gin Leu 


Val Thr 


Gin 


Leu 


785 


790 








795 






800 


Met Pro 




Gly Cys Leu Leu 
805 


Asp 


His 


Val 


Arg Glu 


Asn Arg 


Gly 


Arg 








810 






815 




Leu Gly 


Ser 


Gin Asp Leu Leu 


Asn 


Trp 


Cys 


Met Gin 


He Ala 


Lys 


Gly 




820 




825 






830 




Ala 


Met Ser 


Tyr 


Leu Glu Asp Val 


Arg 


Leu 


Val 


His Arg 


Asp Leu 


Ala 




835 




840 








845 










Arg 


Asn Val 


Leu 


Val 


Lys 


Ser 


Pro 


Asn 


His 


Val 




He 


r 


Asp 


Phe 


850 








855 




















Gly 


Leu Ala 


Arg 


Leu 




Asp 


He 


Asp 


Glu 




u 


Tyr 


is 


Ala 


Asp 


865 








870 










875 










880 


Gly 


Gly Lys 


Val 




Ile


Lys 


Trp 






U 


Gl 

U 


Ser 
er 


He 


Leu Arg 




885 










890 










895 




Arg 


Arg Phe 


Thr 


His


Gln


Ser 


Asp 


V 

a 


rp 


er 


Tyr 


Gl 


Val 


Thr 


Val 


900 




















910 






Trp 


Glu Leu 


Met 


Thr 


Phe 


Gly 




Lys 


Pro 


Tyr 


Asp 


Gly 


.. 

e 


Pro 


Ala 


915 










920 


















Arg 


Glu Ile


Pro 


Asp 


Leu 




G u 


Lys 


Gly 


Glu 


Arg 


Leu 


Pro 


Gin 


Pro 


930 








935 










940 










Pro 


He Cys 


Thr 


He 


Asp 


Val 


Tyr 


Met 


He 


Met 


Val 


Lys 


Cys 


Trp 


Met 
960 


945 






950 










955 










Ile


Asp Ser 


Glu 


Cys 


Arg 


Pro 


Arg 


Phe 


Arg 


Glu 


Leu 


Val 


Ser 


Glu 


Phe 






965 










970 










975 




Ser 


Arg Met


Ala 


Arg 


Asp 


Pro 


Gin 


Arg 


Phe 


Val 


Val 


He 


Gin 


Asn 


Glu 




960 










985 










990 






Asp 


Leu Gly 


Pro 


Ala 


Ser 


Pro 


Leu 


Asp Ser Thr Ph« 


i Tyr Arg St 


jr L« 



995 1000 1005 







Asp 


Asp 


Asp 










P 1020 










1010 










1015 














Leu 


Val 


Pro 


Gin 




Gly 




Phe 


Cys Pro 


ASf> 1035 






Gly 




























Al 


1040 


Gly 


Met 




His 


His 
1045 


Arg 


His Arg 
Gly Le! 


Ser Ser 
1050 


Ser 


Thr 


Arg 




1055 




Gly 


Asp 


Leu 


Thr 
1060 


Leu 




Glu Pro 
1065 


Ser 


Glu 


Glu 


Glu 


Ala 


Pro 


Arg 


Ser 


Pro 


Leu 


Ala 


Pro Ser 


Glu Gly 


Ala 


Gly 


Ser 




1070 








1075 






1080 








Asp 


Val 


Phe 


Asp 


Gly 


Asp 


Leu


Gly 


Met Gly 


Ala Ala 


Lys 


Gly 


Leu


1085 










1090 






1095 








Gin 


Ser 
1100 


Leu 


Pro 


Thr 


His 


Asp 
1105 


Pro 


Ser Pro 


Leu Gin 
1110 


Arg 


Tyr 


Ser 


Glu 


Asp 
1115 


Pro 


Thr 


Val 


Pro 


Leu 
1120 


Pro 


Ser Glu 


Thr Asp 
1125 


Gly 


Tyr 


Val 


Ala 


Pro 


Leu 


Thr 


Cys 


Ser 


Pro 


Gin 


Pro Glu 


Tyr Val 


Asn 


Gin 


Pro 




1130 








1135 






1140 








Asp 


Val 


Arg 


Pro 


Gin 


Pro 


Pro 


Ser 


Pro Arg 


Glu Gly 


Pro 


Leu 


Pro 


1145 








1150 






1155 








Ala 


Ala 


Arg 


Pro 


Ala 


Gly 


Ala 


Thr 


Leu Glu 


Arg Ala 


Lys 


Thr 






1160 










1165 






1170 








Ser 


Pro 
1175 


Gly 


Lys 


Asn 


Gly 


Val 
1180 


Val 


Lys Asp 


Val Phe 
1185 


Ala 


Phe 


Gly 


Gly 


Ala 


Val 


Glu 


Asn 


Pro 


Glu 


Tyr 


Leu Thr 


Pro Gin 


Gly 


Gly 


Ala 


1190 










1195 






1200 








Ala 


Pro 
1205 


Gin 


Pro 


His 


Pro 


Pro 
1210 


Pro 


Ala Phe 


Ser Pro 
1215 


Ala 


Phe 


Asp 


Asn 


Leu 
1220 




Tyr 


Trp 


Asp 


Gin 
1225 


Asp 


Pro Pro 


Glu Arg 
1230 


Gly 


Ala 


Pro 


Pro 


Ser 


Thr 


Phe 


Lys 


Gly 


Thr 


Pro 


Thr Ala 


Glu Asn 


Pro 


Glu 


Tyr 




1235 








1240 






1245 








Leu 


Gly 
1250 


Leu 


Asp 


Val 


Pro 


Val 
1255 


















<210> 37 

<211> 532 

<212> PRT 

<213> Homo sapiens 



<400> 37 




















Met Glu Leu Asp Leu Ser Pro Pro His Leu Ser Ser Ser Pro Glu Asp
1               5                   10                  15

Leu Trp Pro Ala Pro Gly Thr Pro Pro Gly Thr Pro Arg Pro Pro Asp
            20                  25                  30

Thr Pro Leu Pro Glu Glu Val Lys Arg Ser Gln Pro Leu Leu Ile Pro
        35                  40                  45

Thr Thr Gly Arg Lys Leu Arg Glu Glu Glu Arg Arg Ala Thr Ser Leu
    50                  55                  60

Pro Ser Ile Pro Asn Pro Phe Pro Glu Leu Cys Ser Pro Pro Ser Gln
65                  70                  75                  80

Ser Pro Ile Leu Gly Gly Pro Ser Ser Ala Arg Gly Leu Leu Pro Arg
                85                  90                  95

Asp Ala Ser Arg Pro His Val Val Lys Val Tyr Ser Glu Asp Gly Ala
            100                 105                 110

Cys Arg Ser Val Glu Val Ala Ala Gly Ala Thr Ala Arg His Val Cys
        115                 120                 125

Glu Met Leu Val Gln Arg Ala His Ala Leu Ser Asp Glu Thr Trp Gly
    130                 135                 140

Leu Val Glu Cys His Pro His Leu Ala Leu Glu Arg Gly Leu Glu Asp
145                 150                 155                 160

His Glu Ser Val Val Glu Val Gln Ala Ala Trp Pro Val Gly Gly Asp
                165                 170                 175

Ser Arg Phe Val Phe Arg Lys Asn Phe Ala Lys Tyr Glu Leu Phe Lys
            180                 185                 190

Ser Ser Pro His Ser Leu Phe Pro Glu Lys Met Val Ser Ser Cys Leu
        195                 200                 205

Asp Ala His Thr Gly Ile Ser His Glu Asp Leu Ile Gln Asn Phe Leu
    210                 215                 220

Asn Ala Gly Ser Phe Pro Glu Ile Gln Gly Phe Leu Gln Leu Arg Gly
225                 230                 235                 240


Ser 


Gly 


Arg 


Lys 


Leu Trp Lys Arg 


Phe 


Phe 


Cys Phe 




Arg 


Arg 










245 




250 








255 




Gly 


Leu 


Tyr 


Tyr 


Ser Thr Lys Gly 


Thr 


Ser 




Pro 


Arg 


His 


Leu 




260 




265 








270 






Gln Tyr Val Ala Asp Val Asn Glu Ser Asn Val Tyr Val Val Thr Gln
        275                 280                 285

Gly Arg Lys Leu Tyr Gly Met Pro Thr Asp Phe Gly Phe Cys Val Lys
    290                 295                 300

Pro Asn Lys Leu Arg Asn Gly His Lys Gly Leu Arg Ile Phe Cys Ser
305                 310                 315                 320

Glu Asp Glu Gln Ser Arg Thr Cys Trp Leu Ala Ala Phe Arg Leu Phe
                325                 330                 335

Lys Tyr Gly Val Gln Leu Tyr Lys Asn Tyr Gln Gln Ala Gln Ser Arg
            340                 345                 350

His Leu His Pro Ser Cys Leu Gly Ser Pro Pro Leu Arg Ser Ala Ser
        355                 360                 365


Asp 


Asn 


Thr 




Val Ala Met Asp 


Phe 


Ser 


Gly His 


Ala 


Gly 


Arg 


Val 


370 






375 






380 














He Glu 


Asn 


Pro 


Arg 




Ala 


Leu 


Ser Val 




Leu 


U 


Gl 

U 


Ala 
a 


Gin 


385 






390 








395 










400 


Ala Trp


Arg 


Lys 


Lys 


Thr 


Asn 


His 




er 


u 


ro 


Met 


Pro 






405 








g 410 










415 




Ser Gly 


Thr 


Ser 


Leu 


Ser 


a 


a 


11 His 




Thr 


Gin 


Leu 


Trp 


Phe 




420 


















430 






His Gly 


Arg 


He 


Ser 


Arg 


Glu 


Gl 


Ser Gin 
er 


Arg 


Leu 


He 


Gl 


Gin 


Gin 


435 










440 








445 








Gly Leu 


Val 


Asp 


Gly 


Leu 


Phe 


Leu 


Val Arg 


Glu 


Ser 


Gin 


Arg 


Asn 


Pro 


450 










455 








460 










Gin Gly 


Phe 


Val 


Leu 




Leu 


Cys 


is u 


Gin 


ys 


Val 


ys 


His 


Tyr 










470 








475 










480 


Leu He 


Leu 


Pro 


Ser 


Glu 


Glu 


Glu 


Gly Arg 


Leu 


Tyr 


Phe 


Ser 


Met 


Asp 








485 








490 










495 




Asp Gly 


Gin 


Thr 


Arg 


Phe 


Thr 


Asp 


Leu Leu 
505 


Gin 


Leu 


Val 


Glu 
510 


Phe 


His 


Gin Leu 


Asn 


500 
Arg 


Gly 


He 


Leu 


Pro 


Cys Leu 


Leu 


Arg 


His 


Cys 


Cys 


Thr 




515 








520 








525 








Arg Val 


Ala 


Leu 
























530 




























<210> 38

<211> 534

<212> PRT

<213> Homo sapiens

<400> 38


























Met Lys 


Gin 


Glu 


Gly 


Ser 


Ala 


Arg 


Arg Arg 


Y 


a 


As 

P 


ys 


Ala 


L 

ys 


1 






5 






















Pro Pro 


Pro 


Gly 


Gly 


Gly 


Glu 


Gin 


Glu Pro 


ro 


ro 


ro 


Pro 
30° 


Ala 
a 


Pro 






20 
























Val 


Glu 


Met 


Lys 


Glu 


Glu 


1 

Ala Ala 


r 


y 




Gl 

Y 


Ser 
er 


Thr 


35 










40 
















Gly Glu 


Ala 


Asp 


Gly 


Lys 


^ r 




Ala a 


Ala 


Val 


Glu 


His 
is 


Ser 


Gin 


50 


















60 










Arg Glu 
65 




Asp 


Thr 


a 


r 


U 


As 
u p 




ys 


Glu 
U 


His 


Val 


Lys 


























80 


Gin Leu 


Glu 


Lys 


Ala 


a 


er 


Y 


L s Glu 


Pro 




Phe 


Val 


Leu 


Arg 






85 








^ 90 










95 




Ala Leu 


Arg 


Met 


Leu 


Pro 


Ser 


Thr 


l ns 9 


g 


Leu 


Asn 


His 


Tyr 


Val 




100 


















110 






Leu Tyr 




Ala 


Val 


Gin 


Gly 




Phe Thr 


er 


n 


1 


Ala 


Thr 
r 


g 


115 










120 
















Asp Phe 


Leu 




Pro 


Phe 




Glu 


Glu Pro 


Met 


Asp 


Thr 


Glu 


3 


As 

P 


130 










135 


















Leu Gin 


Phe 


Arg 


Pro 


Arg 


Thr 


Gly 


Lys Ala 


Ala 


Ser 


Thr 


Pro 


Leu 


Leu 


145 






150 








155 










160 


Pro Glu 


Val 


Glu 


Ala 


Tyr 


Leu 


Gin 


Leu Leu 


Val 


Val 


He 


Phe 


Met 


Met 








165 






170 










175 






Lys 


Arg 






Glu 


Ala 


Gin Lys 


He 


Ser 


Asp 


Asp 


Leu 


Met 




180 










185 








190 






Gin Lys 


He 


Ser 


Thr 


Gin 


Asn 


Arg 


Arg Ala 


Leu 


Asp 




Val 


Ala 


Ala 


195 










200 








205 












Lys Cys Tyr Tyr Tyr His Ala Arg Val Tyr Glu Phe Leu Asp Lys Leu
    210                 215                 220

Asp Val Val Arg Ser Phe Leu His Ala Arg Leu Arg Thr Ala Thr Leu
225                 230                 235                 240

Arg His Asp Ala Asp Gly Gln Ala Thr Leu Leu Asn Leu Leu Leu Arg
                245                 250                 255

Asn Tyr Leu His Tyr Ser Leu Tyr Asp Gln Ala Glu Lys Leu Val Ser
            260                 265                 270

Lys Ser Val Phe Pro Glu Gln Ala Asn Asn Asn Glu Trp Ala Arg Tyr
        275                 280                 285

Leu Tyr Tyr Thr Gly Arg Ile Lys Ala Ile Gln Leu Glu Tyr Ser Glu
    290                 295                 300

Ala Arg Arg Thr Met Thr Asn Ala Leu Arg Lys Ala Pro Gln His Thr
305                 310                 315                 320

Ala Val Gly Phe Lys Gln Thr Val His Lys Leu Leu Ile Val Val Glu
                325                 330                 335

Leu Leu Leu Gly Glu Ile Pro Asp Arg Leu Gln Phe Arg Gln Pro Ser
            340                 345                 350

Leu Lys Arg Ser Leu Met Pro Tyr Phe Leu Leu Thr Gln Ala Val Arg
        355                 360                 365

Thr Gly Asn Leu Ala Lys Phe Asn Gln Val Leu Asp Gln Phe Gly Glu
    370                 375                 380

Lys Phe Gln Ala Asp Gly Thr Tyr Thr Leu Ile Ile Arg Leu Arg His
385                 390                 395                 400

Asn Val Ile Lys Thr Gly Val Arg Met Ile Ser Leu Ser Tyr Ser Arg
                405                 410                 415

Ile Ser Leu Ala Asp Ile Ala Gln Lys Leu Gln Leu Asp Ser Pro Glu
            420                 425                 430

Asp Ala Glu Phe Ile Val Ala Lys Ala Ile Arg Asp Gly Val Ile Glu
        435                 440                 445

Ala Ser Ile Asn His Glu Lys Gly Tyr Val Gln Ser Lys Glu Met Ile
    450                 455                 460

Asp Ile Tyr Ser Thr Arg Glu Pro Gln Leu Ala Phe His Gln Arg Ile
465                 470                 475                 480

Ser Phe Cys Leu Asp Ile His Asn Met Ser Val Lys Ala Met Arg Phe
                485                 490                 495

Pro Pro Lys Ser Tyr Asn Lys Asp Leu Glu Ser Ala Glu Glu Arg Arg
            500                 505                 510

Glu Arg Glu Gln Gln Asp Leu Glu Phe Ala Lys Glu Met Ala Glu Asp
        515                 520                 525

Asp Asp Asp Ser Phe Pro
    530













<210> 39 

<211> 207 

<212> PRT 

<213> Homo sapiens 



<400> 39 

Met Ala Gly Pro Ala Thr Gln Ser Pro Met Lys Leu Met Ala Leu Gln
1               5                   10                  15

Leu Leu Leu Trp His Ser Ala Leu Trp Thr Val Gln Glu Ala Thr Pro
            20                  25                  30

Leu Gly Pro Ala Ser Ser Leu Pro Gln Ser Phe Leu Leu Lys Cys Leu
        35                  40                  45

Glu Gln Val Arg Lys Ile Gln Gly Asp Gly Ala Ala Leu Gln Glu Lys
    50                  55                  60

Leu Val Ser Glu Cys Ala Thr Tyr Lys Leu Cys His Pro Glu Glu Leu
65                  70                  75                  80

Val Leu Leu Gly His Ser Leu Gly Ile Pro Trp Ala Pro Leu Ser Ser
                85                  90                  95

Cys Pro Ser Gln Ala Leu Gln Leu Ala Gly Cys Leu Ser Gln Leu His
            100                 105                 110

Ser Gly Leu Phe Leu Tyr Gln Gly Leu Leu Gln Ala Leu Glu Gly Ile
        115                 120                 125

Ser Pro Glu Leu Gly Pro Thr Leu Asp Thr Leu Gln Leu Asp Val Ala
    130                 135                 140

Asp Phe Ala Thr Thr Ile Trp Gln Gln Met Glu Glu Leu Gly Met Ala
145                 150                 155                 160

Pro Ala Leu Gln Pro Thr Gln Gly Ala Met Pro Ala Phe Ala Ser Ala
                165                 170                 175




Phe Gin 


Arg 


Arg 


Ala 


Gly 


Gly 


Val Leu 


Val 


Ala 


Ser 


His 




Gin 


Ser 














185 










190 






Phe Leu 


Glu 


Val 


Ser 


Tyr 


Arg 


Val Leu 


Arg 


His 


Leu 


Ala 


Gin 


Pro 






195 


















205 








<210> 40

<211> 989

<212> PRT

<213> Homo sapiens

<400> 40


























Met Lys Val 


Val 


Asn 


Leu 


Lys 


Gin Ala 


He 


Leu 


Gin Ala 


Trp 


Lys 


Glu 


1 






5 








10 










15 




Arg Trp 


Ser 




Tyr 


Gin 


Trp 


Ala He 


Asn 


Met 


Lys Lys 


Phe 


Phe 


Pro 




20 








25 










30 






Lys Gly 


Ala 


Thr 


Trp 


Asp 


He 


Leu Asn 


Leu 


Ala 


Asp 


Ala 


Leu 


Leu 


Glu 


35 










40 








45 








Gin Ala 


Met 


He 


Gly 


Pro 


Ser 


Pro Asn 


Pro 


Leu 


Ile


Leu 


Ser 


Tyr 


Leu 


50 








55 








60 










Lys Tyr 


Ala 


He 


Ser 


Ser 


Gin 


Met Val 


Ser 


Tyr 


Ser 


Ser 


Val 


Leu 


Thr 


65 








70 








75 










80 


Ala He 


Ser 


Lys 


Phe 


Asp 


Asp 


Phe Ser 


Arg 


Asp 


Leu 


Cys 


Val 


Gin 


Ala 






85 








90 










95 






Asp 


He 


Met 


Asp 


Met 


Phe Cys 


Asp 


Arg 


Leu 


Ser 


Cys 


His 


Gly 






100 






105 










110 






Lys Ala 


Glu 


Glu 


Cys 


He 


Gly 


Leu Cys 


Arg 


Ala 


Leu 


Leu 


Ser 


Ala 


Leu 


115 








120 








125 








His Trp 


Leu 


Leu 


Arg 


Cys 


Thr 


Ala Ala 


Ser 


Ala 


Glu Arg 




Arg 




130 










135 








140 










Gly Leu 


Glu 


Ala 


Gly 


Thr 


Pro 


Ala Ala 


Gly 


Glu 


Lys Gin 


Leu 


Ala 


Met 


145 








150 








155 










160 


Cys Leu 


Gin 


Arg 


Leu 


Glu 


Lys 


Thr Leu 


Ser 


Ser 


Thr Lys 


Asn 


Arg 


Ala 






165 








170 










175 




Leu Leu 


His 


He 


Ala 


Lys 


Leu 


Glu Glu 


Ala 


Ser 


Ser 


Trp 


Thr 


Ala 


He 






180 






185 










190 










Glu 


His Ser 


Leu 


Leu 


Lys 


Leu 


Gly 


Glu 


He 


Leu 


Thr 


Asn Leu 


Ser 


Asn 




195 










200 










205 






Pro 


Gin Leu 


Arg 


Ser 


Gin 


Ala 


Glu 


Gin 


Cys 


Gly 


Thr 


Leu He 


Arg 


Ser 




210 








215 










220 








Ile


Pro Thr 


Met 


Leu 


Ser 


Val 


His 


Ala 


Glu 


Gin 


Met 


His Lys 


Thr 


Gly 


225 








230 










235 








240 


Phe 


Pro Thr 


Val 


His 


Ala 


Val 


He 


Leu 


Leu 


Glu 


Gly 


Thr Met 


Asn 


Leu 








245 










250 








255 




Thr 


Gly Glu 


Thr 


Gin 


Ser 


Leu 


Val 


Glu 


Gin 


Leu 


Thr 


Met Val 


Lys 


Arg 




260 










265 








270 






Met 


Gin His 


He 


Pro 


Thr 


Pro 


Leu 


Phe 


Val 


Leu 


Glu 


He Trp 


Lys 


Ala 




275 










280 










285 






Cys 


Phe Val 


Gly 


Leu 


He 


Glu 


Ser 


Pro 


Glu 


Gly 


Thr 


Glu Glu 


Leu 


Lys 


290 








295 










300 








Trp 


Thr Ala 


Phe 


Thr 


Phe 


Leu 


Lys 


He 


Pro 


Gin 


Val 


Leu Val 


Lys 


Leu 


305 








310 










315 








320 


Lys 


Lys Tyr 


Ser 


His 


Gly 


Asp 


Lys 


Asp 


Phe 


Thr 


Glu 


Asp Val 


Asn 


Cys 




325 










330 








335 




Ala 


Phe Glu 


Phe 


Leu 


Leu 


Lys 


Leu 


Thr 


Pro 


Leu 


Leu 


Asp Lys 


Ala 


Asp 






340 










345 








350 






Gin 


Arg Cys 


Asn 


Cys 


Asp 


Cys 


Thr 


Asn 


Phe 


Leu 


Leu 


Gin Glu 


Cys 


Gly 




355 








360 










365 






Lys 


Gin Gly 


Leu 


Leu 


Ser 


Glu 


Ala 


Ser 


Val 


Asn 


Asn 


Leu Met 


Ala 


Lys 


370 








375 










380 








Arg 


Lys Ala 


Asp 


Arg 


Glu 


His 


Ala 


Pro 


Gin 


Gin 


Lys 


Ser Gly 


Glu 


Asn 


385 






390 










395 








400 


Ala 


Asn He 


Gin 


Pro 


Asn 


He 


Gin 


Leu 


He 


Leu 


Arg 


Ala Glu 


Pro 


Thr 








405 










410 








415 




Val 


Thr Asn 


He 


Leu 


Lys 


Thr 


Met 


Asp 


Ala 


Asp 


His 


Ser Lys 


Ser 


Pro 






420 








425 








430 






Glu 


Gly Leu 
435 


Leu 


Gly 


Val 


Leu 


Gly 


His 


Met 


Leu 


Ser 


Gly Lys 


Ser 


Leu 










440 










445 






Asp 


Leu Leu 


Leu 


Ala 


Ala 


Ala 


Ala 


Ala 


Thr 


Gly 


Lys 


Leu Lys 


Ser 


Phe 




450 








455 










460 








Ala 


Arg Lys 


Phe 


He 


Asn 


Leu 


Asn 


Glu 


Phe 


Thr 


Thr 


Tyr Gly 


Ser 


G q U 


465 






470 










475 








480 


Glu 


Ser Thr 


Lys 


Pro 


Ala 


Ser 


Val 


Arg 


Ala 


Leu 


Leu 


Phe Asp 


He 


Ser 






485 










490 








495 




Phe 


Leu Met 


Leu 


Cys 


His 


Val 


Ala 


Gin 


Thr 


Tyr 


Gly 


Ser Glu 


Val 


He 






500 








505 








510 






Leu 


Ser Glu 


Ser 


Arg 


Thr 


Gly 


Ala 


Glu 


Val 


Pro 


Phe 


Phe Glu 


Thr 


Trp 




515 








520 










525 






Met 


Gin Thr 


Cys 


Met 


Pro 


Glu 


Glu 


Gly 


Lys 


He 


Leu 


Asn Pro 


Asp 


His 




530 








535 










540 








Pro 


Cys Phe 


Arg 


Pro 


Asp 


Ser 


Thr 


Lys 


Val 


Glu 


Ser 


Leu Val 


Ala 


Leu 


545 






550 










555 








56 


Leu 


Asn Asn 


Ser 


Ser 


Glu 


Met 


Lys 


Leu 


Val 


Gin 


Met 


Lys Trp 


His 


Glu 








565 










570 








575 




Ala 


Cys Leu 


Ser 


He 


Ser 


Ala 


Ala 


He 


Leu 


Glu 


He 


Leu Asn 


Ala 


Trp 




580 










585 








590 






Glu 


Asn Gly 


Val 




Ala 


Phe 


Glu 


Ser 


He 


Gin 


Lys 


He Thr 


Asp 


Asn 




595 










600 










605 






He 


Lys Gly 


Lys 


Val 


Cys 


Ser 


Leu 


Ala 


Val 


Cys 


Ala 


Val Ala 


Trp 


Leu 




610 








615 










620 












Val 


Arg 


Met 




Gly 


Leu 


Asp 


Glu 


Arg 


Glu Lys 


Ser 




625 






630 










635 








640 


Gin 


Met He 


Arg 


Gin 


Leu 


Ala 


Gly 


Pro 


Leu 


Phe 


Ser 


Glu Asn 


Thr 


Leu 






645 










650 








655 





















Gin Phe Tyr 


Asn 


Glu 


Arg Val Val He Met Asn Ser 




u 


Arg 


660 




665 








Met Cys Ala


Asp 


Val 


Leu Gin Gin Thr Ala Thr Gin 




Phe 


ro 


675 






680 


685 






Ser Thr Gly 


Val 


Asp 


Thr Met Pro Tyr Trp Asn Leu 


Leu Pro 


Pro 


ys 


690 






695 700 








Arg Pro Ile


Lys 


Glu 


Val Leu Thr Asp He Phe Ala 


Lys Val 


Leu 




705 




710 715 






720 


Lys Gly Trp 


Val 


Asp 


Ser Arg Ser He His He Phe 


Asp Thr 




u 




725 


730 




735 




His Met Gly 


Gly 


Val 


Tyr Trp Phe Cys Asn Asn Leu 


He Lys 


Glu 


Leu 


740 




745 








Leu Lys Glu 


Thr 


Arg 


Lys Glu His Thr Leu Arg Ala 


Val Glu 


Leu 


Leu 


755 






760 








Tyr Ser Ile


Phe 


Cys 


Leu Asp Met Gin Gin Val Thr 


Leu Va 


Leu 


U 


770 






775 780 








Gly His He 


Leu 


Pro 


Gly Leu Leu Thr Asp Ser Ser 


Lys Trp 


His 




785 






790 795 






800 


Leu Met Asp 


Pro 


Pro 


Gly Thr Ala Leu Ala Lys Leu 


Ala Val 


Trp 


Cys 




805 


810 




815 




Ala Leu Ser 


Ser 


Tyr 


Ser Ser His Lys Gly Gin Ala 


Ser Thr 


Arg 


Gin 




820 


825 


830 






Lys Lys Arg 


His 


Arg 


Glu Asp He Glu Asp Tyr He 


Ser Leu 


Phe 


Pro 


835 






840 


845 






Leu Asp Asp 


Val 


Gin 


Pro Ser Lys Leu Met Arg Leu 


Leu Ser 


Ser 


Asn 


850 






855 860 








Glu Asp Asp 
865 


Ala 


Asn 


He Leu Ser Ser Pro Thr Asp 


Arg Ser 


Met 








870 875 






880


Ser Ser Leu 


Ser 


Ala 


Ser Gin Leu His Thr Val Asn 


Met Arg 


Asp 


ro 






885 


890 




895 




Leu Asn Arg 


Val 


Leu 


Ala Asn Leu Phe Leu Leu He 


Ser Ser 


He 


Leu 


900 




905 


910 






Gly Ser Arg 


Thr 


Ala 


Gly Pro His Thr Gin Phe Val 


925 


Phe 


Met 


915 














Glu Glu Cys 


Val 


Asp 


Cys Leu Glu Gin Gly Gly Arg 


Gly Ser 


Val 


Leu 


930 






935 940 








Gin Phe Met 


Pro 


Phe 


Thr Thr Val Ser Glu Leu Val 


Lys Val 


Ser 


Ala 


945 






950 955 






960 


Met Ser Ser 


Pro 


Lys 


Val Val Leu Ala He Thr Asp 


Leu Ser 


Leu 


Pro 






965 


970 




975 




Leu Gly Arg 


Gin 


Val 


Ala Ala Lys Ala Ile Ala Ala


Leu 








980 




985 








<210> 41 














<211> 490 














<212> PRT 














<213> Homo sapiens










<400> 41 














Met Glu Gln


Lys 


Pro 


Ser Lys Val Glu Cys Gly Ser 


Asp Pro 


Glu 


Glu 


1 


5 


10 




15 




Asn Ser Ala 


Arg 


Ser 


Pro Asp Gly Lys Arg Lys Arg 


Lys Asn 


Gly 


Gin 




20 




25 


30 












Cys 


Ser 


Leu 


Lys 


Thr 


Ser 


Met 








35 












Lys 


Asp 


Glu 


Gin 


Cys 


Val 


Val 


5 


50 










55 




Tyr 


Arg 


Cys 


He 


Thr 


Cys 


Glu 




65 










70 






lie 


Gin 


Lys 


Asn 


Leu 


His 


Pro 












85 






10 


Cys 


Val 


He 


Asp 


Lys 


He 


Thr 








100 










Lys 


Lys 


Cys 


He 


Ala 


Val 


Gly 








115 












Ser


Lys 


Arg 


Val 


Ala 


Lys 


Arg 






130 










135 


15 


Arg 


Arg 


Lys 


Glu 


Glu 


Met 


He 




145 










150 






Thr 


Pro 


Glu 


Glu 


Trp 


Asp 


Leu 












165 








Ser 


Thr 


Asn 


Ala 


Gin 


Gly 


Ser 


20 








180 










Pro 


Asp 


Asp 


He 


Gly 


Gin 


Ser 








195 












Lys 


Val 


Asp 


Leu 


Glu 


Ala 


Phe 






210 










215 


25 


Ala 


He 


Thr 


Arg 


Val 


Val 


Asp 


225 










230 






Glu 


Leu 


Pro 


Cys 


Glu 


Asp 


Gin 












245 








Glu 


He 


Met 


Ser 


Leu 


Arg 


Ala 










260 








30 


Thr 


Leu 


Thr 


Leu 


Ser 


Gly 


Glu 








275 












Asn 


Gly 


Gly 


Leu 


Gly 


Val 


Val 




















Ser 


Leu 


Ser 


Ala 


Phe 


Asn 


Leu 




305 










310 




35 


Ala 


Val 


Leu 


Leu 


Met 


Ser 


Thr 












325 








Lys 


He 


Glu 


Lys 


Ser 


Gin 


Glu 










340 










Val 


Asn 


His 
355 


Arg 


Lys 


His 


Asn 


40 


Met 


Lys 


Glu 


Arg 


Glu 


Val 


Gin 






370 










375 




Ala 


Glu 


Gly 


Arg 


Pro 


Gly 


Gly 




385 










390 






Gin 


Leu 


Leu 


Gly 


Met 


His 


Val 


45 










405 








Glu 


Gin 


Gin 


Leu 


Gly 


Glu 


Ala 










420 










His 


Gin 


Ser 


Pro 




Ser 


Pro 








435 










50 


Arg 


Ser 


Gly 


He 


Leu 


His 


Ala 






450 










455 




Ser 


Glu 


Ala 


Asp 


Ser 


Pro 


Ser 




465 










470 






Glu 


Asp 


Leu 


Ala 


Gly 


Asn 


Ala 










485 







55 



Ser 


Gly 


Tyr 


He 


Pro 


Ser Tyr 


Leu 


Asp 


40 










45 






Cys 


Gly 


Asp 


Lys 


Ala 


Thr Gly 


Tyr 


His 










60 








Gly 


Cys 


Lys 


Gly 


Phe 


Phe Arg 


Arg 


Thr 








75 








80 


Thr 


Tyr 


Ser 


Cys 


Lys 


Tyr Asp 


Ser 


Cys 






90 








95 




Arg 


Asn 


Gin 


Cys 


Gin 


Leu Cys 


Arg 


Phe 


105 








110 






Met 


Ala 


Met 


Asp 


Leu 


Val Leu 


Asp 


Asp 


120 










125 






Lys 


Leu 


He 


Glu 


Gin 


Asn Arg 


Glu 


Arg 










140 








Arg 


Ser 


Leu 


Gin 


Gin 


Arg Pro 


Glu 


Pro 






155 








160 


He 


His 


He 


Ala 


Thr 


Glu Ala 


His 


Arg 






170 








175 




His 


Trp 


Lys 


Gin 


Arg 


Arg Lys 


Phe 


Leu 




185 








190 






Pro 


He 


Val 


Ser 


Met 


Pro Asp 


Gly 


Asp 


200 










205 






Ser 


Glu 


Phe 


Thr 


Lys 


He He 


Thr 


Pro 










220 








Phe 


Ala 


Lys 


Lys 


Leu 


Pro Met 


Phe 


Ser 








235 








240 


He 


He 


Leu 


Leu 


Lys 


Gly Cys 


Cys 


Met 






250 








255 




Ala 


Val 


Arg 


Tyr 


Asp 


Pro Glu 


Ser 


Asp 




265 








270 






Met 


Ala 


Val 


Lys 


Arg 


Glu Gin 


Leu 


Lys 


280 










285 






Ser 


Asp 


Ala 


He 


Phe 


Glu Leu 


Gly 


Lys 










300 








Asp 


Asp 


Thr 


Glu 


Val 


Ala Leu 


Leu 


Gin 








315 








320 


Asp 


Arg 


Ser 


Gly 


Leu 


Leu Cys 


Val 


Asp 




330 








335 




Ala 


Tyr 


Leu 


Leu 


Ala 


Phe Glu 


His 


Tyr 




345 








350 






He 


Pro 


His 


Phe 


Trp 


Pro Lys 


Leu 


Leu 


360 










365 






Ser 


Ser 


He 


Leu 


Tyr 


Lys Gly 


Ala 


Ala 










380 








Ser 


Leu 


Gly 


Val 


His 


Pro Glu 


Gly 


Gin 








395 








400 


Val 


Gin 


Gly 


Pro 


Gin 


Val Arg 


Gin 


Leu 






410 








415 




Gly 


Ser 


Leu 


Gin 


Gly 


Pro Val 


Leu 


Gin 




425 








430 






Gin 


Gin 






Leu 


Glu Leu 


Leu 


His 


440 










445 






Arg 


Ala 


Val 


Cys 


Gly 




Asp 


Ser 










460 








Ser 


Ser 


Glu 


Glu 


Glu 


Pro Glu 


Val 


Cys 








475 








480 


Ala 


Ser 


Pro 













490 






<210> 42 

<211> 614 

<212> PRT 

<213> Homo sapiens 



10 



<400> 42 
















Met 


Thr 


Thr 


Leu 


Asp Ser Asn Asn 


Asn Thr Gly Gly 


Val 


He 


Thr 


Tyr 


1 








5 


10 






15 




Ile


Gly 


Ser 


Ser 


Gly Ser Ser Pro 


Ser Arg Thr Ser 


Pro 


Glu 


Ser 


Leu 






20 




25 




30 






Tyr 


Ser 


Asp 

35 


Asn 


Ser Asn Gly Ser 


Phe Gln Ser Leu


Thr 


Gin 


Gly 


Cys 






40 




45 








Pro 


Thr 


Tyr 


Phe 


Pro Pro Ser Pro 


Thr Gly Ser Leu 


Thr 


Gin 


Asp 


Pro 




50 




55 


60 










Ala 


Arg 


Ser 


Phe 


Gly Ser Ile Pro


Pro Ser Leu Ser 


Asp 


Asp 


Gly 


Ser 


65 






70 


75 








80 


Pro 


Ser 


Ser 


Ser 


Ser Ser Ser Ser 


Ser Ser Ser Ser 


Ser 


Phe 


Tyr 


Asn 










85 


90 






95 




Gly 


Ser 


Pro 


Pro 


Gly Ser Leu Gln


Val Ala Met Glu 


Asp 


Ser 


Ser 


Arg 






100 




105 




110 






Val 


Ser 


Pro 


Ser 


Lys Ser Thr Ser 


Asn Ile Thr Lys


Leu 


Asn 


Gly 


Met 






115 




120 




125 








Val 


Leu 


Leu Cys 


Lys Val Cys Gly 


Asp Val Ala Ser 


Gly 


Phe 


His 


Tyr 




130 






135 


140 










Gly 


Val 


Leu 


Ala 


Cys Glu Gly Cys 


Lys Gly Phe Phe 


Arg 


Arg 


Ser 




145 








150 


155 








160 


Gin 


Gin 


Asn 


He 


Gln Tyr Lys Arg


Cys Leu Lys Asn 


Glu 


Asn 


Cys 


Ser 










165 


170 






175 




He 


Val 


Arg 


He 


Asn Arg Asn Arg 


Cys Gln Gln Cys


Arg 


Phe 


Lys 


Lys 






180 




185 










Cys 


Leu 


Ser Val 


Gly Met Ser Arg 


Asp Ala Val Arg 


Phe 


Gly 


Arg 


He 




195 




200 




205 








Pro 


Lys 


Arg Glu 


Lys Gln Arg Met


Leu Ala Glu Met 


Gin 


Ser 


Ala 


Met 






















Asn 


Leu 


Ala 


Asn 


Asn Gln Leu Ser


Ser Gln Cys Pro


Leu 


Glu 


Thr 


Ser 


225 








230 


235 








240 


Pro 


Thr 


Gin 


His 


Pro Thr Pro Gly 


Pro Met Gly Pro 


Ser 


Pro 


Pro 


Pro 










245 


250 






255 




Ala 


Pro 


Val 


Pro 


Ser Pro Leu Val 


Gly Phe Ser Gln


Phe 


Pro 


Gin 


Gin 








260 




265 




270 






Leu 


Thr 


Pro 


Pro 


Arg Ser Pro Ser 


Pro Glu Pro Thr 


Val 


Glu 


Asp 


Val 






275 




280 




285 








He 


Ser 


Gin Val 


Ala Arg Ala His 


Arg Glu Ile Phe


Thr 


Tyr 


Ala 


His 




290 






295 


300 










Asp 


Lys 


Leu 


Gly 


Ser Ser Pro Gly 


Asn Phe Asn Ala 


Asn 


His 


Ala 


Ser 


305 






310 


315 








320 


Gly 


Ser 


Pro 


Pro 


Ala Thr Thr Pro 


His Arg Trp Glu 


Asn 


Gin 


Gly 


Cys 








325 


330 






335 




Pro 


Pro 


Ala 


Pro 


Asn Asp Asn Asn 


Thr Leu Ala Ala 


Gin 


Arg 


His 


Asn 








340 


345 




350 






Glu 


Ala 




Asn 


Gly Leu Arg Gln


Ala Pro Ser Ser 


Tyr 


Pro 


Pro 


Thr 






355 




360 




365 














Trp 


Pro 


Pro 


Gly 


Pro 


Ala 






370 












Gly 


His 


Arg 


Leu 


Cys 


Pro 


5 


385 










390 




Ala 


Pro 


Ala 


Asn 


Ser 


Pro 












405 






Ala 


Cys 


Pro 


Met 


Asn 


Met










420 






10 


Gin 


Glu 


Ile


Trp 


Glu 


Asp 






435 










Glu 


Val 


Val 


Glu 


Phe 


Ala 






450 












Gin 


His 


Asp 


Gin 


Val 


Thr 




465 










470 


15 


Met 


Val 


Arg 


Phe 


Ala 


Ser 












485 






Phe 


Leu 


Ser 


Arg 


Thr 


Thr 










500 








Met 


Gly 


Asp 


Leu 


Leu 


Ser 


20 






515 








Ser 


Leu 


Ala 


Leu 


Thr 


Glu 






530 












Leu 


Val 


Ser 


Ala 


Asp 


Arg 




545 










550 






Leu 


Gin 


Glu 


Thr 


Leu 


25 










565 






Asn 


Arg 


Pro 


Leu 


Glu 


Thr 










580 








Pro 


Asp 


Leu 


Arg 


Thr 


Leu 








595 










Phe 


Arg 


Val 


Asp 


Ala 


Gin 






610 












<210> 43 










<211> 703 








35 
















<212> PRT










<213> Homo sapiens




40 


<400> 43 










Met


Ala 


Asp 


Arg 


Arg 


Arg 




1 








5 






Glu 


Ser 


Gly 


Ala 


Ser 


Gly 


45 








20 








Gly 


Gly 


Ser 


Cys 


Ser 


Gly 








35 










Pro 


Ser 


Gin 


Arg 










50 










50 


Glu 


Ser 


Gly 


Gly 


Ala 






65 










70 




Asp 


Gly 


He 


Glu 


Gly 


Asp 












85 






Asp 


Ser 


Glu 


Gly 


Glu 


Glu 








100 







55 






His His 


Ser 


Cys 


His 


Gin 


Ser 


Asn 


Ser 


Asn 


375 








380 










Thr His 


Val 


Tyr 


Ala 


Ala 


Pro 


Glu 


Gly 








395 










400 


Arg Gin 


Gly 


Asn 


Ser 


Lys 


Asn 


Val 


Leu 


Leu 




410 










415 




Tyr Pro 


His 


Gly 


Arg 


Ser 


Gly 


Arg 


Thr 


Val 


425 










430 






Phe Ser 


Met 


Ser 


Phe 


Thr 


Pro 


Ala 


Val 


Arg 


440 










445 








Lys His 


He 


Pro 


Gly 


Phe 


Arg 


Asp 


Leu 


Ser 


455 








460 










Leu Leu 


Lys 


Ala 


Gly 


Thr 


Phe 


Glu 


Val 








475 












Leu Phe 


Asn 


Val 


Lys 


Asp 


Gin 


Thr 


Val 


Met 






490 










495 




Tyr Ser 


Leu 


Gin 


Glu 


Leu 


Gly 


Ala 


Met 


Gly 


505 










510 






Ala Met 


Phe 


Asp 


Phe 


Ser 


Glu 


Lys 


Leu 


n 


520 










525 








Glu Glu 


Leu 


Gly 


Leu 




Thr 


Ala 


Val 


a 


535 








540 










Ser Gly 


Met 


Glu 


Asn 


Ser 


Ala 


Ser 


Val 


Glu 






555 










560 


Leu Arg 


Ala 


Leu


Arg 


Ala 


Leu 


Val 


Leu


Lys 




570 










575 




Ser Arg 


Phe 


Thr 


Lys 


Leu 


Leu 


Leu 


Lys 


Leu 


585 










590 






Asn Asn 


Met 


His 


Ser 


Glu 


Lys 


Leu 


Leu 


Ser 


600 


















Gin Arg 


Ala 


Ser 


Gin 


Asp 


Thr 


u 


Asp 


u 




10 










15 




Ser Asp 


Ser 


Gly 


Gly 


Ser 


Pro 


Leu 


Arg 


Gly 




25 










30 






Ser Ala 


Gly 


Gly 


Gly 


Gly 


Ser 


Gly 


Ser 


Leu 


40 




















Gl 


Ala 


Leu 


His 


Leu 


Arg 


Arg 


Val 


55 








60 










Ser Ala 


Glu 


Glu 


Ser 


Glu 


Cys 


Glu 


Ser 


Glu 








75 










80 


Ala Val 


Leu 


Ser 


Asp 




Glu 


Ser 


Ala 


Glu 






90 










95 




Gly Glu 




Ser 


Glu 


Glu 


Glu 


Asn 


Ser 


Lys 


105 










110 










Val 


Glu Leu 


Lys 


Ser 


Glu 


Ala 


Asn 


Asp 


Ala 


Val Asn 




Ser 


Thr 


Lys 




115 








120 
















Glu 


Glu Lys 
130 


Gly 


Glu 


Glu 


135 




Asp 


Thr 


Lys Ser 
140 


Thr 


Val 


Thr 


Gly 


Glu 


Arg Gin 


Ser 


Gly 


Asp 


Gly 


















145 






150 










155 








160 


Lys 


Val Gly 


Lys 


Lys 


Gly 


Pro 


Lys 


His 


Leu 


Asp Asp 


Asp 


Glu 


Asp 


Arg 


165 










170 








175 




Lys 


Asn Pro 


Ala 


Tyr 


He 


Pro 


Arg 


Lys 


Gly 


Leu Phe 


Phe 




His 


Asp 




180 










185 








190 






Leu 


Arg Gly 
195 


Gin 


Thr 


Gin 


Glu 


Glu 
200 


Glu 


Val 


Arg Pro 




Gly 


Arg 


Gin 


Arg 


Lys Leu 

210 


Trp 


Lys 


Asp 


Glu 
215 


Gly 


Arg 


Trp 


Glu His 
220 


Asp 


Lys 


Phe 


Arg 


Glu 


Asp Glu 


Gin 


Ala 


Pro 


Lys 


Ser 


Arg 


Gin 




He 


Ala 


Leu 


Tyr 


225 






230 


















240 


Gly 


Tyr Asp 


Ha 


Arg 


Ser 


Ala 


His 


Asn 


Pro 


Asp Asp 


He 


Lys 




Arg 




245 










250 












Arg 


He Arg 


Lys 


Pro 


Arg 


Tyr 


Gly 


Ser 


Pro 


Pro Gin 


Arg 


Asp 


Pro 


n 


260 










265 








270 






Trp 


Asn Gly 


Glu 


Arg 


Leu 


Asn 


Lys 


Ser 


His 


Arg His 


Gin 


Gly 


Leu 


Gly 


275 










280 








285 








Gly 


Thr Leu 


Pro 


Pro 


Arg 


Thr 


Phe 


He 


Asn 


Arg Asn 


Ala 


Ala 


Gly 


r 


290 








295 


















Gly 


Arg Met 


Ser 


Ala 


Pro 


Arg 


Asn 


Tyr 




Ar S 


Gly 


Gly 


Phe 


Lys 


305 






310 










315 








320 


Glu 


Gly Arg 


Ala 


Gly 


Phe 


Arg 


Pro 


Val 


Glu 


Ala Gly 


Gly 


Gin 




Gly 






325 










330 








335 




Gly 


Arg Ser 


Gly 


Glu 


Thr 


Val 


Lys 


His 


Glu 


He Ser 


Tyr 


Arg 


Ser 


Arg 


340 










345 








350 






Arg 


Leu Glu 


Gin 


Thr 


Ser 


Val 


Arg 


Asp 


Pro 


Ser Pro 


Glu 


Ala 


Asp 


Ala 


355 










360 








365 








Pro 


Val Leu 

370 


Gly 


Ser 


Pro 


Glu 
375 


Lys 


Glu 




380 










Ala 


Ala Ala 


Pro 


Asp 


Ala 


Ala 


Pro 


Pro 


Pro 




Arg 


Pro 


He 




385 






390 


















400 


Lys 


Lys Ser 


Tyr 


Ser 


Arg 


Ala 


Arg 


Arg 


Thr 


Arg Thr 


Lys 


V 


Gl 


As 

P 














410 












Ala 


Val Lys 




Al 5 














Glu 


Gly 


Leu 


He 




420 










425 








430 






Pro 


Ala Pro 
435 


Pro 


Val 


Pro 


Glu 


Thr 
440 


Thr 


Pro 


Thr Pro 


JAR 


Thr 


Lys 


Thr 


Gly 


Thr Trp 


Glu 


Ala 


Pro 


Val 


Asp 


Ser 


Ser 


r ART! 


G y 


Leu 


U 




450 








455 


















Asp 


Val Ala 


Gin 


Leu 


Asn 






Gl 


Gl 


Asn Trp 


Ser 


Pro 


Gly 


Gin 


465 








470 










475 








480 


Pro 


Ser Phe 


Leu 


Gin 
485 


Pro 


Arg 


Glu 


Leu 


Arg 
490 


Gly Met 


Pro 


Asn 




He 


His 


Met Gly 


Ala 


Gly 


Pro 


Pro 


Pro 


Gl 


Phe 


Asn Ar 
Asn Arg 


Met 


Glu 


Glu 


Met 




500 








505 








510 






Gly 


Val Gin 


Gly 


Gly 


Arg 


Ala 


Lys 


Arg 


Tyr 


Ser Ser 


Gin 


Arg 


Gin 


Arg 


515 










520 








525 








Pro 


Val Pro 
530 


Glu 


Pro 


Pro 


Ala 
535 


Pro 


Pro 


Val 


His He 
540 


Ser 


He 


Met 


Glu 


Gly 




Tyr 


Asp 


Pro 


Leu 


Gin 


Phe 


Gin 


Gly Pro 


He 


Tyr 


Thr 


His 


545 




550 










555 








560 


Gly 


Asp Ser 


Pro 


Ala 


Pro 


Leu 


Pro 


Pro 


Gin 


Gly Met 




Val 


Gin 


Pro 



565 570 575 






Gly Met 


Asn 


Leu Pro His Pro 


Gly Leu His 


Pro His Gln


590




580 


585 






Pro Leu 


Pro 


Asn Pro Gly Leu 


Tyr Pro Pro 


Pro Val Ser 


Met Ser Pro




595 




600 






Gly Gln


Pro


Pro Pro Gln Gln


Leu Leu Ala 


Pro Tyr 


Phe Ser Ala 


610 




615 








Pro Gly 


Val 


Met Asn Phe Gly 


Asn Pro Ser 


Tyr Pro Tyr


Ala Pro Gly 


625 










640 


Ala Leu 


Pro 


Pro Pro Pro Pro 


Pro His Leu 


Tyr Pro Asn 


Thr Gln Ala






645 


650 




655 


Pro Ser 


Gin 


Val Tyr Gly Gly 


Val Thr Tyr 


Tyr Asn Pro 


Ala Gln Gln






660 


665 




670 


Gin Val 


Gin 


Pro Lys Pro Ser 


Pro Pro Arg 


Arg Thr Pro 


Gln Pro Val




675 




680 


685 




Thr Ile


Lys 


Pro Pro Pro Pro 


Glu Val Val 


Ser Arg Gly 


Ser Ser 


690 


695 




700 




<210> 44










<211> 560 










<212> PRT










<213> Homo sapiens








<400> 44 










Met Pro


Gln


Thr Arg Ser Gln


Ala Gln Ala


r e er 


Phe Pro Lys 
S 15 


1 




5 








Arg Lys 


Leu 


Ser Arg Ala Leu 


Asn Lys Ala 


Asn Ser 
ys n er 


Ser Asp Ala 










30 


Lys Leu 


Glu 


Pro Thr Asn Val 


Gln Thr Val


Ser 


Pro Arg Val 


35 




40 


1 45 




Lys Ala 


Leu 


Pro Leu Ser Pro 


Arg Lys Arg 


As 

SU 60 y P 


As Asn Leu 
P 


50 




55 










Thr 


Pro His Leu Pro 


Pro Cys Ser 


Pro Pro Lys 


Gln Gly Lys


65 




70 




t 5 T Gl 


80 


Lys Glu 


Asn 


Gly Pro Pro His 


Ser His Thr


u ys y 


3 95^ 




85 


L S° 






Val Phe 


Asp 


Asn Gln Leu Thr


8 Sr 


Pro Ser L s 

ro y 


Arg Glu Leu 




100 






110 


Ala Lys 


Val 


His Gln Asn Lys


Ile Leu Ser


Ser Val Arg


Lys Ser Gln


115 






125 




Glu Ile


Thr 


Thr Asn Ser Glu 


Gln Arg Cys


140 Y 


Lys Glu Ser 


130 




135 








Ala Cys 


Val 


Arg Leu Phe Lys 


Gln Glu Gly


155 ^ ^ 


Gln Gln Ala


145 




150 






160 


Lys Leu 


Val 


Leu Asn Thr Ala 


Val Pro Asp 


Arg Leu Pro 


Ala Arg Glu 












Arg Glu 


Met 


Asp Val Ile Arg


Asn Phe Leu 


Arg Glu His 


Ile Cys Gly




180 


185 




190 


Lys Lys 


Ala 


Gly Ser Leu Tyr 


Leu Ser Gly 


Ala Pro Gly 


Thr Gly Lys 


195 




200 


205 




Thr Ala 


Cys 


Leu Ser Arg Ile


Leu Gln Asp


Leu Lys Lys 


Glu Leu Lys 


210 


215 




220 




Gly Phe 


Lys 


Thr Ile Met Leu


Asn Cys Met 


Ser Leu Arg 


Thr Ala Gln


225 


230 




235 


240 






Ala 


Val 


Phe Pro 


Ala lie 


Ala Gin 




Cys n u 


Glu Val Ser 








245 




250 




255 


Arg 


Pro 


Ala Gly 


Lys Asp 


Met Met 


Arg ys 


Gl L S 
u u y 


His Met Thr 




260 










270 


Ala 


Glu 


Lys Gly 


Pro Met 




Leu Val 


° P 285 


Met Asp Gin 






275 




280 








Leu 


Asp 


Ser Lys 


Gly Gin 


Asp Val 


Leu Tyr 


Thr Leu Phe 


Gl Trp Pro 
u rp 




290 




295 








Trp 


Leu 


Ser Asn 


Ser His


Leu Val 


Leu He 


Gly He Ala 


Asn Thr Leu 


305 






310 








320 


Asp 


Leu 


Thr Asp 


Arg lie 


Leu Pro 


Arg Leu 


Gin Ala Arg 


335 




325 










Lys 


Pro 




Leu Asn 


Phe Pro 




r 9 


Gin He Val 




340 












Thr 


lie 


Leu Gin 


Asp Arg 


Leu Asn 


Gin Va 


Ser Arg Asp 


Gin Val Leu 






355 






365 




Asp 


Asn 


Ala Ala 


Val Gin 


Phe Cys 


Ala Arg 


^flO 


Ala Val Ser 


370 






375 








Gly 


Asp 


Val Arg 


Lys Ala 


Leu Asp 


Val Cys 




He Glu He 


385 




390 








400 


Val 


Glu 


Ser Asp 


Val Lys 


Ser Gin 


Thr lie 


Leu Lys Pro 


L ° U 415 






405 










Cys 


Lys 


Ser Pro 


Ser Glu 


Pro Leu 


He Pro 


ys g a 


Gly Leu He 


420 






425 




430 


His 


lie 


Ser Gin 


Val He 




Val Asp 


Gly Asn Arg 


Met Thr Leu 






435 




440 




of 5 




Ser 


Gin 


Glu Gly 


Ala Gin 


Asp Ser 


Phe Pro 


Leu Gin Gin 


L s He Leu 




450 




455 




Le L^° He 




Val 


Cys 


Ser Leu 


Met Leu 


Leu He 


Arg G n 


u ys e 


Lys Glu Val 


465 




470 








480 


Thr 


Leu 


Gly Lys 


Leu Tyr 


Glu Ala 


490 


Lys a cys 


9 495 
















Gin 


Val 


Ala Ala 


Val Asp 


Gin Ser 


Glu Cys 


Leu Ser Leu 


Ser Gly Leu 






500 




505 




510 


Leu 


Glu 


Ala Arg 


Gly He 


Leu Gly 


Leu Lys 


Arg Asn Lys 


Glu Thr Arg 






515 


520 




525 




Leu 


Thr 


Lys Val 


Phe Phe 


Lys He 


Glu Glu 


Lys Glu lie 


Glu His Ala 




530 




535 




540 




Leu 


Lys 


Asp Lys 


Ala Leu 


He Gly 


Asn He 


Leu Ala Thr 


Gly Leu Pro 


545 


550 






555 


560 


<210> 45 












<211> 462 












<212> PRT 












<213> Homo sapiens











<400> 45 

Met Ala Ser Asn Ser Ser Ser Cys Pro Thr Pro Gly Gly Gly His Leu 

1 5 10 15

Asn Gly Tyr Pro Val Pro Pro Tyr Ala Phe Phe Phe Pro Pro Met Leu 

20 25 30 

Gly Gly Leu Ser Pro Pro Gly Ala Leu Thr Thr Leu Gln His Gln Leu






Pro 


Val 


Ser 


Gly Tyr 


Ser 


Thr 


Pro Ser 


Pro Ala Thr 


He 


Glu 


Thr 


Gin 




50 








55 




60 










Ser 


Ser 


Ser 


Ser Glu 


Glu 


He 


Val Pro 


Ser Pro Pro 


Ser 


Pro 


Pro 


nn° 


65 








70 






75 










Leu 


Pro 


Arg 


He Tyr 


Lys 


Pro 


Cys Phe 


Val Cys Gln
90 


Asp 


Lys 




Ser 


Gly 


Tyr 


His 


85 
Tyr Gly 


Val 


Ser 


Ala Cys 


Glu Gly Cys 


Lys 




Phe 


Phe 




100 






105 






110 






Arg 


Arg 


Ser 


He Gin 


Lys 


Asn 


Met Val 


Tyr Thr Cys 


His 


Arg 


Asp 


Lys 


115 








120 




125 








Asn 


Cys 


He 


He Asn 


Lys 


Val 


Thr Arg 


Asn Arg Cys 


Gin 


Tyr 


Cys 


Arg 




130 








135 




140 










Leu 


Gin 


Lys 


Cys Phe 


Glu 


Val 


Gly Met 


Ser Lys Glu 


Ser 


Val 


Arg 




145 




150 






155 








160 


Asp 


Arg 


Asn 


Lys Lys 


Lys 


Lys 


Glu Val 


Pro Lys Pro 


Glu 


Cys 




Glu 




165 








170 










Ser 


Tyr 


Thr 


Leu Thr 


Pro 


Glu 


Val Gly 


Glu Leu Ile


Glu 


Lys 


Val 


Arg 






180 






185 






190 






Lys 


Ala 


His 


Gin Glu 


Thr 


Phe 


Pro Ala 


Leu Cys Gln


Leu 


Gly 


Lys 


Tyr 




195 








200 




205 








Thr 




Asn 


Asn Ser 


Ser 


Glu 


Gin Arg 


Val Ser Leu 


Asp 


He 


Asp 


Leu 




210 








215 




220 










Trp 


Asp 


Lys 


Phe Ser 


Glu 


Leu 


Ser Thr 


Lys Cys Ile


He 


Lys 


Thr 


225 






230 






235 








240 


Glu 


Phe 


Ala 


Lys Gin 


Leu 


Pro 


Gly Phe 


Thr Thr Leu 


Thr 


He 


Ala 


Asp 








245 








250 






255 




Gin 


He 


Thr 


Leu Leu 


Lys 


Ala 


Ala Cys 


Leu Asp Ile


Leu 




Leu 


Arg 








260 




265 












lie 


Cys 


Thr 


Arg Tyr 


Thr 


Pro 


Glu Gin 


Asp Thr Met 




Phe 


Ser 


Asp 




275 






280 




285 








Gly 


Leu 


Thr 


Leu Asn 


Arg 


Thr 


Gin Met 


His Asn Ala 


Gly 


Phe 


Gly 


Pro 


290 








295 




300 










Leu 


Thr 


Asp 


Leu Val 


Phe 


Ala 


Phe Ala 


Asn Gln Leu


Leu 


Pro 


Leu 




305 






310 






315 








320 


Met 


Asp 


Asp 


Ala Glu 


Thr 


Gly 


Leu Leu 


Ser Ala Ile


Cys 


Leu 




Cys 




325 








330 










Gly 


Asp 


Arg 


Gin Asp 


Leu 


Glu 


Gin Pro 


Asp Arg Val 


Asp 


Met 


Leu 


G n 


340 






345 






350 






Glu 


Pro 


Leu 


Leu Glu 


Ala 


Leu 


Lys Val 


Tyr Val Arg 


Lys 


Arg 


Arg 


Pro 






355 








360 




365 








Ser 


Arg 


Pro 


His Met 


Phe 


Pro 


Lys Met 


Leu Met Lys


He 


Thr 


Asp 


Leu 




370 








375 




380 










Arg 


Ser 


He 


Ser Ala 


Lys 


Gly 


Ala Glu 


Arg Val Ile


Thr 


Leu 


ys 


M 


385 








390 














400 








Gly Ser 




Pro 


Pro Leu 


Ile Gln Glu


Met 


Leu 


Glu 


Asn 








405 








410 






415 




Ser 


Glu 


Gly 


Leu Asp 


Thr 


Leu 


Ser Gly 


Gln Pro Gly


Gly 


Gly 


Gly 


Arg 






420 






425 






430 






Asp 


Gly 


Gly 


Gly Leu 


Ala 


Pro 


Pro Pro 


Gly Ser Cys 


Ser 


Pro 


Ser 


Leu 


435 








440 




445 








Ser 


Pro 


Ser 


Ser Asn 


Arg 


Ser 


Ser Pro 


Ala Thr His 


Ser 


Pro 








450 








455 




460 











55 






<210> 46 

<211> 1531 

<212> PRT

<213> Homo sapiens



<400> 46 








Met


Glu 


Val 


Ser


Pro Lao Gin 


Pro 


l 








5 




Lya 


Xla 


Lys 


Lys 


Asn Glu Aap 


Ala 








20 






Xla 


Xyr 


Gin 


Lys 


Lys The Gin 


Lau 






3S 






40 


TKr 


Tyr 


lie 


Gly 


Sar Val Glu 


Lau 




50 






55 




Aap 


Glu 


Asp 


Val 


Gly Xla Aan 


Tyr 


65 








70 




leu 


Tyr 


Lya 


Xla 


Pha Aep Glu 


He 










B5 




Cln 


Arg 


Asp 


Pro 


Lya Met Sar 


cy- 








100 






Asn 


Asn 


Leu 


Ila 


Sar tie Trp 


Asn 






115 






120 


Glu 


His 


Lys 


Val 


Glu Lys Mat 


Tyr 




130 










IAU 


Lau 


Thr 


Sar 


Sar Aan Xyr 


A8P 


145 








150 




Gly 


Arg 


Asn 


Gly 


Tyr Gly Ala 


Lys 










165 




Phe 


The 


val 


Glu 


Thr Ala Sar 


Arg 








180 






Thr 


Isp 


Met 


Asp 


Asn Mat Gly 


Arg 






195 






200 


Phe 


Asn 


Gly 


GlU 


Aap '-Tyr The 


Cys 




210 






215 




Lya 


Pha 


Lye 


Hat 


Gin See lau 


Asp 


225 








230 




Arg 


Arg 


Ala 


Tyr 


Asp Xla Ala 


Gly 








245 




Lau 


Asn 


Gly 


Aan 


lys Lau Pro 


Val 






260 






Mat 


Tyr 


Lau 
275 


Lya 


Aap Lya Leu 


Asp 
280 


He 


Hia 


Glu 


Gin 


Val Asn His 


Arg 




290 






295 




Glu 


Lys 


Gly 


Phe 


Gin Gin Xla 


Sar 


305 




310 




lya 


Gly 


Gly 


Arg 


His Val Asp 


Tyr 






325 




Leu 


Val 


Asp 


Val 


Val Lys Lya 


Lys 








340 




Ala 


Bis 


Sin 


Val 


Lys Asn His 


Hat 






355 




360 


Glu 


Asn 


Pro 


The 


Phe Asp Sar 


Cln 




370 






375 





Val Aan 


Glu Asn 


Met 


Gin 


Val 


Asn 


10 








15 




Lys Lys 


Arg Leu 


See 


Val 


Glu 


Arg 


25 






30 






Glu His 


lie Leu 


Lau 


Arg 


Pea 


Asp 






45 








Val The 


Gin Gin 




Tcp 


Val 


Tyr 




SO 










Arg Glu 


Val Thr 


Phe 


Val 


Pro 


Gly 


75 








80 


Leu Val 


Aan Ala 


Ala 


Aap 


Aan 


Lys 


90 






9S 


Xla Arg 


Val The 


Xla 


Asp 


Pro 


Glu 


105 






no 






Aan Gly 


Lys Gly 


Xla 


Pro 


Val 


Val 






125 








Val Pro 


Ala Lau 


Ila 


Phe 


Gly 


Gin 




140 










Asp Asp 


Glu Lys 


Lys 


Val 


Thr 


Gly 




155 








160 


Lau cys 


Asn Xle 


Pha 


Sar 


Thr 


Lys 


170 








175 




Glu Tyr 


Lys Lys 


Met 


Phe 


Lya 


Gin 


185 






190 






Ala Gly 


Glu Mat 


Glu 
205 


Lau 


Lys 


Pro 


lie The 


Pha Gin 


Pro 


Asp 


Leu 


Sar 




220 










Lys Asp 


Xle Val 


Ala 


Leu 


Mat 


val 


235 








240 


Sar The 


Lys Asp 


Val 


Lya 


Val 


Pha 


250 








255 




Lys Gly 


Pha Arg 


Sar 


Tyr 


Val 


Asp 


265 






270 






Glu Thr 


Gly Asn 


Sar 


Lau 


Lys 


Val 






265 








Trp Glu 


Val Cys 


Lau 


Thr 


Mat 


Sar 




300 










Phe val 


Asn See 


Xla 


Ala 


Thr 


Sar 




315 








320 


Val Ala 


Asp Gin 


lie 


val 


Thr 


Lys 


330 








335 




Asn Lys 


Gly Gly 


Val 


Ala 


Val 


Lys 


345 




350 






Sep Xla 


Phe Val 


Asn 


Ala 


Leu 


lie 




365 








Thr Lys 


Glu Aan 




The 


Leu 




380 














385 
Ala 

Phe 

Asn 

Gly 

Ala 
4SS 

Tyr 

Ser 
He 
Iff* 



tie 
Pro 
Lya 



Lya Ser ghe 

Ala lie Gly 

Lys Ala Gin 
420 

Asg lis Lys 

435 
Arg Asn Sec 
450 

Lya Thr Leu 

Gly Val She 

His Lya Gin 
500 

Val Gly lAiu 

515 
Thr Lau Aug 
530 

Gly Sar His 



Val Xtf * v al 
580 

Glu Phe Glu 
595 

Val lya Tyr 
610 

Gin Tyr Phe 



Gly Ser 
390 
Cys Gly 
405 

Val Gin 
Sly lie 
Thr Gin 



Tor Cys Gin 

He Val Glu 

Leu Asn Lys 
425 

pro lys Len 

440 

Cys Thr Leu 
455 



Leu Ser Glu Lys 



395 
: lie 1 



Ala Val 
470 

Pro Leu Arg Gly Lys 
485 

He Mat 



Gin Syr 
tyr Gly 

II* Lya 

550 
Leu Arg 
5E5 

Sar Lys 
Glu Srp 
Tyr Lys 



Ser Gly Pro Glu 

Gin lie Asp Asp 
660 

Arg Gin Arg Lys 
675 

Thr Thr Thr Tyr 



705 
Gly 



Glu 
Ha 



Lau Lys Pro 

Asp Lys Arg 
740 

Nat Ser Ser 

755 
Asn Leu Ala 
770 

Pro He Gly 



i Phe Pro Pro 
620 

i Gin Arg vaa 



Asp Asp 
645 

Arg Lys 



710 
Gly Gin 
725 

Glu Val 

Tyr His 

Gin Asn 

Gin Phe 
790 
Tyr He 
B05 

Lys Aap 
clu Pro 



Glu Aan Ala 
50S 

Lya Lys Asa 

520 
Lys lie Met 
535 

Gly Lau Lau 

Sis Arg Phe 

Asn Lys Gin 
585 

Lys Ser Ser 

600 
Gly Lau Gly 
616 

Mat Lys Arg 

Ala Ala lis S 
f 

Glu Trp Leu 1 
6G5 

Gly Leu Pro C 
630 

Tyr Aan Asp I 
695 

Asn Glu Arg £ 

Arg Lya Val I 

1 

Lys Val Ala G 
745 

His Gly Glu 

760 
Phe Val Gly 
775 

Gly Thr Arg 



Ser Ala 

Ala Asn 
445 

Thr Glu 
460 

val Gly 
Asn Val 
Asn Asn 



lie * 

He J 

Leu 
570 
Glu 

Ear 

Thr 

His 



SS5 
Glu C 



525 
: Thr Asp 
540 
I Ha 



* Asn His 
605 
: Thr Ser 

620 
r Ha Gin 

i Ala Phe 

i Phe Met 

> Tyr Leu 
695 
i Aan Lys 

700 
i Pro Ser 

i Thr Cys 

t Ala Gly 



Ser Asn Asn Leu 
780 

Lau Hi" Gly Gly 



Phs He Lys 
400 

Trp Val Lya 

415 
Val Lya Bis 
430 

Asp Ala Gly 

Gly Asp Sez 

Arg Asp Lys 
480 

Arg Glu Ala 

495 
He He Lys 
510 

Asp Bar Lau 

Gin Asp Gin 

His His Asn 
560 

lie Thr Pro 

575 
Tyr Ser Ian 
590 

Lya Lys Trp 

Lys Glu Ala 

She Lys Tyr 
640 

Ser Lys Lys 
655 

Glu Asp Arg 
670 

Tyr Gly Gin 

Glu Leu He 

Mat Val Aap 
720 

Pha Lys Arg 

735 
Ser Val Ala 
750 

Met Thr He 
Asn Leu Leu 
Lys Asp Sar 



Asp His Thr 
825 

Glu Srp Tyr 
B40 



aio 

Leu I 
He I 



815 

i Phe Leu Tyr Asp Asp 
330 

> He He Pro Met Val 
845 






He 


Asn 


Gly 


Ala 


Glu 


Gly 


He 


Gly 


Thr 


Gly 


Trp 


Ser 


Cys 


Lys 


He 


850 








855 










860 










Asn 


Phe 


Asp 


Val 


Arg 


Glu 


He 


Val 


Asn 


Asn 


He 


Arg 


Arg 


Leu 


Met 








870 










875 










880 


Gly 


Glu 


Glu 


Pro 


Leu 


Pro 


Met 


Leu 


Pro 


Ser 


Tyr 


Lys 


Asn 


Phe 


Lys 






885 










890 










895 




Thr 


He 


Glu 

900 


Glu 


Leu 


Ala 


Pro 


Asn 
905 


Gin 


Tyr 


Val 


He 


Ser 
910 


Gly 


Glu 


Ala 


He 
915 


Leu 


Asn 


Ser 


Thr 


Thr 
920 


He 


Glu 


He 


Ser 


Glu 
925 


Leu 


Pro 


Val 


Thr 


Trp 


Thr 


Gin 


Thr 


Tyr 


Lys 


Glu 


Gin 


Val 


Leu 


Glu 


Pro 


Met 


Leu 


930 








935 




















Gly 


Thr 


Glu 


Lys 


Thr 


Pro 


Pro 


Leu 


He 


Thr 


Asp 


Tyr 


Arg 


Glu 


Tyr 






950 










955 










960 


Thr 


Asp 


Thr 


Thr 


Val 




Phe 


Val 


Val 


Lys 


Met 


Thr 


Glu 


Glu 


Lys 






965 










970 










975 




Ala 


Glu 


Ala 
9B0 


Glu 


Arg 


Val 


Gly 


Leu 
985 


His 


Lys 


Val 


Phe 


990 


Leu 


Gin 



Thr Ser Leu Thr Cys Asn Ser Met Val Leu Phe Asp His Val Gly Cys 
995 1000 1005 



Leu 


Lys 


Lys 


Tyr 


Asp 


Thr 


Val 


Leu Asp 


Ile Leu Arg


Asp 


Phe 


Phe 




1010 








1015 




1020 








Glu 


Leu 


Arg 


Leu 


Lys 


Tyr 


Tyr 


Gly Leu 


Arg Lys Glu 


Trp 


Leu 


Leu 




1025 








1030 




1035 








Gly 


Met 


Leu 


Gly 


Ala 


Glu 


Ser 


Ala Lys 


Leu Asn Asn 


n 


a 


Arg 


1040 








1045 




1050 








Phe 


He 


Leu 


Glu 


Lys 


He 


Asp 


Gly Lys 


Ile Ile Ile


Glu 


Asn 


Lys 




1055 










1060 




1065 








Pro 


Lys 


Lys 


Glu 


Leu 


He 


Lys 


Val Leu 


Ile Gln Arg


Gly 


Tyr 


Asp 




1070 








1075 




1080 








Ser 


Asp 


Pro 


Val 


Lys 


Ala 


Trp 


Lys Glu 


Ala Gln Gln


Lys 


Val 


Pro 
















1095 








Asp 


Glu 


Glu 


Glu 


Asn 


Glu 


Glu 


Ser Asp 


Asn Glu Lys 


Glu 


Thr 


Glu 


1100 










1105 




1110 








Lys 


Ser 


Asp 


Ser 


Val 


Thr 


Asp 


Ser Gly 


Pro Thr Phe 


Asn 


Tyr 


Leu 


1115 








1120 




1125 






Glu 


Leu 


Asp 


Met 


Pro 


Leu 


Trp 




Leu Thr 


Lys Glu Lys 


Lys 


Asp 




1130 










1135 




1140 








Leu 


Cys 


Arg 


Leu 


Arg 


Asn 


Glu 


Lys Glu 


Gln Glu Leu


Asp 


Thr 


Leu 




1145 








1150 




1155 








Lys 


Arg 


Lys 


Ser 


Pro 


Ser 


Asp 


Leu Trp 


Lys Glu Asp 


Leu 


Ala 


Thr 


1160 










1165 




1170 








Phe 


Ile


Glu 


Glu 


Leu 


Glu 


Ala 


Val Glu 


Ala Lys Glu 


Lys 


Gln


Asp 




1175 










1180 




1185 








Glu 


Gln


Val


Gly 


Leu 


Pro 


Gly 


Lys Gly 


Gly Lys Ala 


Lys 


Gly 


Lys 




1190 








1195 




1200 








Lys 


Thr 


Gln


Met 


Ala 


Glu 


Val 


Leu Pro 


Ser Pro Arg 


Gly 


Gln


Arg 
Ile
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Glu 
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Lys 
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Asn Thr Glu 
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Glu 
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Glu 
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Ser 


Asp Ser Glu Ser Asp 


Arg 


Ser Ser Asp 


Glu 


Ser Asn 


Phe 


1295 
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Val 


Pro Pro Arg Glu Thr 


Glu 


Pro Arg Arg 


Ala 


Ala Thr 
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Thr 


Lys 


Phe Thr Met Asp Leu 
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Ser Asp Glu 
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Phe Ser 


Asp 
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1335 






Phe 


Asp 


Glu Lys Thr Asp Asp 


Glu 
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Ser Asp 


Ala 
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1345 






1350 






Ser 


Pro 


Pro Lys Thr Lys Thr 


Ser 


Pro Lys Leu 


Ser 


Asn Lys 


Glu 




1355 


1360 






1365 






Leu 


Lys 


Pro Gln Lys Ser Val


Val 


Ser Asp Leu 


Glu 


Ala Asp 


Asp 




1370 


1375 






1380 






Val 


Lys 


Gly Ser Val Pro Leu 


Ser 


Ser Ser Pro 


Pro 


Ala Thr 


His 




1385 
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1395 






Phe 


Pro 


Asp Glu Thr Glu Ile


Thr 


Asn Pro Val 


Pro 


Lys Lys 


Asn 




1400 


1405 






1410 
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Thr 


Val Lys Lys Thr Ala 


Ala 


Lys Ser Gln


Ser 


Ser Thr 


Ser 




1415 


1420 






1425 






Thr 


Thr 


Gly Ala Lys Lys Arg 


Ala 


Ala Pro Lys 


Gly 


Thr Lys 


Arg 




1430 


1435 






1440 






Asp 


Pro 


Ala Leu Asn Ser Gly 


Val 


Ser Gln Lys


Pro 


Asp Pro 


Ala 


1445 


1450 






1455 






Lys 


Thr 


Lys Asn Arg Arg Lys 


Arg 


Lys Pro Ser 


Thr 


Ser Asp 


Asp 


1460 


1465 






1470 






Ser 


Asp 


Ser Asn Phe Glu Lys 


Ile


Val Ser Lys 


Ala 


Val Thr 


Ser 




1475 


1480 






1485 






Lys 


Lys 


Ser Lys Gly Glu Ser 


Asp 


Asp Phe His 


Met 
1500 


Asp Phe 


Asp 


1490 


1495 










Ser 


Ala 


Val Ala Pro Arg Ala 


Lys 


Ser Val Arg 


Ala 


Lys Lys 


Pro 




1505 


1510 






1515 






Ile


Lys 


Tyr Leu Glu Glu Ser 


Asp 


Glu Asp Asp 


Leu 


Phe 






1520 


1525 






1530 
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<212> PRT 

<213> Homo sapiens 



<400> 47 














Met 


Leu 


Pro Leu Cys 


Leu Val 


Ala 


Ala 


Leu 


Leu Leu Ala 


Ala Gly Pro 


1 




5 








10 




15 


Gly 


Pro 


Ser Leu Gly 


Asp Glu 


Ala 


Ile


His 


Cys Pro Pro 


Cys Ser Glu 




20 






25 






30 


Glu 


Lys 


Leu Ala Arg 


Cys Arg 


Pro 


Pro 


Val 


Gly Cys Glu 


Glu Leu Val 




35 




40 






45 




Arg 


Glu 


Pro Gly Cys 


Gly Cys 


Cys 


Ala 


Thr 


Cys Ala Leu 


Gly Leu Gly 


50 




55 








60 




Met 


Pro 


Cys Gly Val 


Tyr Thr 


Pro Arg 


Cys 


Gly Ser Gly 


Leu Arg Cys 


65 




70 








75 


80 


Tyr 


Pro 


Pro Arg Gly 
85 


Val Glu 


Lys 


Pro 


Leu 
90 


His Thr Leu 


Met His Gly 
95 


Gln


Gly 


Val Cys Met 


Glu Leu 


Ala 


Glu 


Ile


Glu Ala Ile


Gln Glu Ser




100 






105 






110 
Leu Gln Pro Ser Asp Lys Asp Glu Gly Asp His Pro Asn Asn Ser Phe

115 120 125 

Ser Pro Cys Ser Ala His Asp Arg Arg Cys Leu Gln Lys His Phe Ala

130 135 140

Lys Ile Arg Asp Arg Ser Thr Ser Gly Gly Lys Met Lys Val Asn Gly
145 150 155 160

Ala Pro Arg Glu Asp Ala Arg Pro Val Pro Gln Gly Ser Cys Gln Ser

165 170 175

Glu Leu His Arg Ala Leu Glu Arg Leu Ala Ala Ser Gln Ser Arg Thr

180 185 190

His Glu Asp Leu Tyr Ile Ile Pro Ile Pro Asn Cys Asp Arg Asn Gly

195 200 205

Asn Phe His Pro Lys Gln Cys His Pro Ala Leu Asp Gly Gln Arg Gly

210 215 220

Lys Cys Trp Cys Val Asp Arg Lys Thr Gly Val Lys Leu Pro Gly Gly
225 230 235 240

Leu Glu Pro Lys Gly Glu Leu Asp Cys His Gln Leu Ala Asp Ser Phe
245 250 255 

Arg Glu 



<210> 48 

<211> 378 

<212> PRT 

<213> Homo sapiens 



<400> 48 , 

Met Asp Leu Gly Lys Pro Met Lys Ser Val Leu Val Val Ala Leu Leu 

1 5 10 15

Val Ile Phe Gln Val Cys Leu Cys Gln Asp Glu Val Thr Asp Asp Tyr

20 25 30

Ile Gly Asp Asn Thr Thr Val Asp Tyr Thr Leu Phe Glu Ser Leu Cys

35 40 45

Ser Lys Lys Asp Val Arg Asn Phe Lys Ala Trp Phe Leu Pro Ile Met

50 55 60

Tyr Ser Ile Ile Cys Phe Val Gly Leu Leu Gly Asn Gly Leu Val Val
65 70 75 80

Leu Thr Tyr Ile Tyr Phe Lys Arg Leu Lys Thr Met Thr Asp Thr Tyr

85 90 95

Leu Leu Asn Leu Ala Val Ala Asp Ile Leu Phe Leu Leu Thr Leu Pro

100 105 110

Phe Trp Ala Tyr Ser Ala Ala Lys Ser Trp Val Phe Gly Val His Phe

115 120 125

Cys Lys Leu Ile Phe Ala Ile Tyr Lys Met Ser Phe Phe Ser Gly Met

130 135 140

Leu Leu Leu Leu Cys Ile Ser Ile Asp Arg Tyr Val Ala Ile Val Gln
145 150 155 160

Ala Val Ser Ala His Arg His Arg Ala Arg Val Leu Leu Ile Ser Lys

165 170 175

Leu Ser Cys Val Gly Ile Trp Ile Leu Ala Thr Val Leu Ser Ile Pro

180 185 190

Glu Leu Leu Tyr Ser Asp Leu Gln Arg Ser Ser Ser Glu Gln Ala Met
195 200 205
Arg Cys Ser Leu Ile Thr Glu His Val Glu Ala Phe Ile Thr Ile Gln

210 215 220

Val Ala Gln Met Val Ile Gly Phe Leu Val Pro Leu Leu Ala Met Ser
225 230 235 240

Phe Cys Tyr Leu Val Ile Ile Arg Thr Leu Leu Gln Ala Arg Asn Phe

245 250 255

Glu Arg Asn Lys Ala Ile Lys Val Ile Ile Ala Val Val Val Val Phe

260 265 270

Ile Val Phe Gln Leu Pro Tyr Asn Gly Val Val Leu Ala Gln Thr Val

275 280 285

Ala Asn Phe Asn Ile Thr Ser Ser Thr Cys Glu Leu Ser Lys Gln Leu

290 295 300

Asn Ile Ala Tyr Asp Val Thr Tyr Ser Leu Ala Cys Val Arg Cys Cys
305 310 315 320

Val Asn Pro Phe Leu Tyr Ala Phe Ile Gly Val Lys Phe Arg Asn Asp

325 330 335

Leu Phe Lys Leu Phe Lys Asp Leu Gly Cys Leu Ser Gln Glu Gln Leu

340 345 350

Arg Gln Trp Ser Ser Cys Arg His Ile Arg Arg Ser Ser Met Ser Val

355 360 365 

Glu Ala Glu Thr Thr Thr Thr Phe Ser Pro 
370 375 

<210> 49 

<211> 411 

<212> PRT 

<213> Homo sapiens 



<400> 49 

Met Ser Lys Arg Pro Ser Tyr Ala Pro Pro Pro Thr Pro Ala Pro Ala 

1 5 10 15

Thr Gln Met Pro Ser Thr Pro Gly Phe Val Gly Tyr Asn Pro Tyr Ser

20 25 30

His Leu Ala Tyr Asn Asn Tyr Arg Leu Gly Gly Asn Pro Ser Thr Asn

35 40 45

Ser Arg Val Thr Ala Ser Ser Gly Ile Thr Ile Pro Lys Pro Pro Lys

50 55 60

Pro Pro Asp Lys Pro Leu Met Pro Tyr Met Arg Tyr Ser Arg Lys Val
65 70 75 80

Trp Asp Gln Val Lys Ala Ser Asn Pro Asp Leu Lys Leu Trp Glu Ile

85 90 95

Gly Lys Ile Ile Gly Gly Met Trp Arg Asp Leu Thr Asp Glu Glu Lys

100 105 110

Gln Glu Tyr Leu Asn Glu Tyr Glu Ala Glu Lys Ile Glu Tyr Asn Glu

115 120 125

Ser Met Lys Ala Tyr His Asn Ser Pro Ala Tyr Leu Ala Tyr Ile Asn

130 135 140

Ala Lys Ser Arg Ala Glu Ala Ala Leu Glu Glu Glu Ser Arg Gln Arg
145 150 155 160

Gln Ser Arg Met Glu Lys Gly Glu Pro Tyr Met Ser Ile Gln Pro Ala

165 170 175 

Glu Asp Pro Asp Asp Tyr Asp Asp Gly Phe Ser Met Lys His Thr Ala 
180 185 190 
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Thr Ala 


Arg 


Phe Gln


Arg 


Asn 
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Glu Ser 


Val 


Val Pro 


Asp 


Val 
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210 
Gln Val


Leu 


Lys Arg 


Gln


Val 




225 






230 








Ala 


Glu Leu 


Leu 


Gln








245 








Arg Lys 
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Leu Glu 


Ser 


Thr 






260 
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Gly 


Leu Lys 


Val 


Glu 






275 
















Glu 


Gln




290 








295 




Lys Glu 


Ala 


Ala Glu 
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Ala 




305 






310 






Glu Glu 


Glu 


Gln Ala


Ala 


Asn 








325 








Asn Ile


Pro 


Met Glu 


Thr 


Glu 








340 








Ser Gln


Gln


Asn Gly 


Glu 


Glu 






355 










Ser Gly 


Gln


Glu Gly 


Val 


Asp 




370 












Ser Asn 


Thr 


Gly Ser 


Glu 


Ser 




385 












Thr Asp 


Pro 


Ile Pro


Glu 


Asp 






405 








<210> 50 










<211> 593 










<212> PRT

<213> Homo sapiens




















<400> 50 
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Arg Tyr 
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Ser 
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Gly Gly 
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Gly 


Gly 
Ile Ser


Ser 


Ser 






35 










Ser Gly 


Gly 


Phe Ser 


Gly 


Gly 


45 


50 








55 




Gly Cys 
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Gly Gly 


Ser 


Ser 




65 
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Gly Gly 


Gly 


Ser Phe 


His 


Gly 








85 








Ser Tyr 


Gly 


Gly Ser 


Phe 


Gly 








100 








Gly Gly 


Gly 


Ser Phe 


Gly 


Gly 






115 










Gly Gly 
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Gly Gly 


Gly 


Phe 




130 








135 
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His 


Arg Leu 


Ile Ser


Glu Ile Leu Ser


200 






205 


Arg 


Ser Val 


Val Thr 


Thr Ala Arg Met 




220 




Gln


Ser Leu 


Met Val 


His Gln Arg Lys






235 


240 


Ile


Glu Glu 


Arg His 


Gln Glu Lys Lys




250 




255 


Asp 


Ser Phe 


Asn Asn 


Glu Leu Lys Arg 


265 




270 


Val 


Asp Met 


Glu Lys 


Ile Ala Ala Glu


280 






285 


Ala 


Arg Lys 


Arg Gln


Glu Glu Arg Glu 






300 




Glu 


Arg Ser 


Gln Ser


Ser Ile Val Pro




315 


320 


Lys 


Gly Glu 


Glu Lys 






330 




335 


Glu 


Thr His 


Leu Glu 


Glu Thr Thr Glu 




345 




350 


Gly 


Thr Ser 


Thr Pro 


Glu Asp Lys Glu 


360 






365 


Ser 


Met Ala


Glu Glu 
380 


Gly Thr Ser Asp 


Asn 


Ser Ala 


Thr Val 


Glu Glu Pro Pro 






395 


400 


Glu 


Lys Lys 


Glu 






410 






Ser 




r Ser 
Tyr Ser 


Ser Ser Arg Ser 




Y 10 




15 


Gly 


Cys Gly 


Gly Gly 


Gly Gly Val Ser 




25 




30 


Lys 


Gly Ser 


Leu Gly 


Gly Gly Phe Ser 


40 






45 


Ser 


Phe Ser 


Arg Gly 
60 


Ser Ser Gly Gly 


Gly 


Gly Tyr 


Gly Gly 


Leu Gly Gly Phe 






75 


80 


Ser 


Tyr Gly 


Ser Ser 


Ser Phe Gly Gly 




90 




95 


Gly 


Gly Asn 


Phe Gly 


Gly Gly Ser Phe 




105 




110 


Gly 


Gly Phe 


Gly Gly 


Gly Gly Phe Gly 


120 
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Gly 


Gly Asp 


Gly Gly 


Leu Leu Ser Gly 



140 
Asn Glu Lys Val Thr Met Gln Asn Leu Asn Asp Arg Leu Ala Ser Tyr
145 150 155 160

Leu Asp Lys Val Arg Ala Leu Glu Glu Ser Asn Tyr Glu Leu Glu Gly

165 170 175

Lys Ile Lys Glu Trp Tyr Glu Lys His Gly Asn Ser His Gln Gly Glu

180 185 190

Pro Arg Asp Tyr Ser Lys Tyr Tyr Lys Thr Ile Asp Asp Leu Lys Asn

195 200 205

Gln Ile Leu Asn Leu Thr Thr Asp Asn Ala Asn Ile Leu Leu Gln Ile

210 215 220

Asp Asn Ala Arg Leu Ala Ala Asp Asp Phe Arg Leu Lys Tyr Glu Asn
225 230 235 240

Glu Val Ala Leu Arg Gln Ser Val Glu Ala Asp Ile Asn Gly Leu Arg

245 250 255

Arg Val Leu Asp Glu Leu Thr Leu Thr Lys Ala Asp Leu Glu Met Gln

260 265 270

Ile Glu Ser Leu Thr Glu Glu Leu Ala Tyr Leu Lys Lys Asn His Glu

275 280 285

Glu Glu Met Lys Asp Leu Arg Asn Val Ser Thr Gly Asp Val Asn Val

290 295 300

Glu Met Asn Ala Ala Pro Gly Val Asp Leu Thr Gln Leu Leu Asn Asn
305 310 315 320

Met Arg Ser Gln Tyr Glu Gln Leu Ala Glu Gln Asn Arg Lys Asp Ala

325 330 335

Glu Ala Trp Phe Asn Glu Lys Ser Lys Glu Leu Thr Thr Glu Ile Asp

340 345 350

Asn Asn Ile Glu Gln Ile Ser Ser Tyr Lys Ser Glu Ile Thr Glu Leu

355 360 365

Arg Arg Asn Val Gln Ala Leu Glu Ile Glu Leu Gln Ser Gln Leu Ala

370 375 380

Leu Lys Gln Ser Leu Glu Ala Ser Leu Ala Glu Thr Glu Gly Arg Tyr
385 390 395 400

Cys Val Gln Leu Ser Gln Ile His Ala Gln Ile Ser Ala Leu Glu Glu

405 410 415

Gln Leu Gln Gln Ile Arg Ala Glu Thr Glu Cys Gln Asn Thr Glu Tyr

420 425 430

Gln Gln Leu Leu Asp Ile Lys Ile Arg Leu Glu Asn Glu Ile Gln Thr

435 440 445 

Tyr Arg Ser Leu Leu Glu Gly Glu Gly Ser Ser Gly Gly Gly Gly Arg 

450 455 460 

Gly Gly Gly Ser Phe Gly Gly Gly Tyr Gly Gly Gly Ser Ser Gly Gly 
465 470 475 480 

Gly Ser Ser Gly Gly Gly Tyr Gly Gly Gly His Gly Gly Ser Ser Gly 

485 490 495 

Gly Gly Tyr Gly Gly Gly Ser Ser Gly Gly Gly Ser Ser Gly Gly Gly 

500 505 510 

Tyr Gly Gly Gly Ser Ser Ser Gly Gly His Gly Gly Gly Ser Ser Ser

515 520 525 

Gly Gly His Gly Gly Ser Ser Ser Gly Gly Tyr Gly Gly Gly Ser Ser 

530 535 540 

Gly Gly Gly Gly Gly Gly Tyr Gly Gly Gly Ser Ser Gly Gly Gly Ser 
545 550 555 560 

Ser Ser Gly Gly Gly Tyr Gly Gly Gly Ser Ser Ser Gly Gly His Lys 

565 570 575 

Ser Ser Ser Ser Gly Ser Val Gly Glu Ser Ser Ser Lys Gly Pro Arg 
580 585 590 

Tyr 
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Ser 
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Gly 
Gln Ile
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Ala 


Arg 
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205 








Leu 


Ala 
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Glu 
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Leu 


Ala 


Leu 
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Arg 


Val 


Leu 
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Thr Arg Thr Asp Leu 


Glu 


Met Gln


Ile


Glu 


Ser 


Leu 










245 


250 
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Glu 


Glu 




Ala Tyr Met Lys Lys 
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Asp 


Glu 


Leu 


Gln








260 


265 








270 






Ser 
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Arg 


Val 


Gly Gly Pro Gly Glu 


Val 
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Met 


Asp 


Ala 






275 




280 






285 








Ala 


Pro 
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Val 


Asp Leu Thr Arg Leu 


Leu 


Asn Asp 


Met 


Arg 


Ala 


Gln




290 




295 




300 










Tyr 
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Thr 


Ile


Ala Glu Gin Asn Arg 
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Glu 


Ala 


Trp 


Phe 


305 








310 




315 








320 


Ile


Glu 
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Ser 


Gly Glu Leu Arg Lys 


Glu 


Ile Ser


Thr 


Asn 


Thr 


Glu 








325 


330 








335 
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Leu 


Gln


Ser 


Ser Lys Ser Glu Val 


Thr 
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Arg 


Arg 
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Phe 








340 


345 








350 
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355 




360 






365 










Glu 


Asp 


Ser 


Leu Ala Glu Ala Glu 
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370 




375 
Ser Gln Val Gln Gln Leu Ile Ser Asn Leu Glu Ala Gln Leu Leu Gln
385 390 395 400

Val Arg Ala Asp Ala Glu Arg Gln Asn Val Asp His Gln Arg Leu Leu

405 410 415

Asn Val Lys Ala Arg Leu Glu Leu Glu Ile Glu Thr Tyr Arg Arg Leu

420 425 430

Leu Asp Gly Glu Ala Gln Gly Asp Gly Leu Glu Glu Ser Leu Phe Val

435 440 445

Thr Asp Ser Lys Ser Gln Ala Gln Ser Thr Asp Ser Ser Lys Asp Pro

450 455 460

Thr Lys Thr Arg Lys Ile Lys Thr Val Val Gln Glu Met Val Asn Gly
465 470 475 480

Glu Val Val Ser Ser Gln Val Gln Glu Ile Glu Glu Leu Met
485 490 

<210> 52 

<211> 361 

<212> PRT 

<213> Homo sapiens 



<400> 52 

Cys Asn Trp Phe Cys Glu Gly Ser Phe Asn Gly Ser Glu Lys Glu Thr 

1 5 10 15

Met Gln Phe Leu Asn Asp Arg Leu Ala Ser Tyr Leu Glu Lys Val Arg

20 25 30

His Val Glu Arg Asp Asn Ala Glu Leu Glu Asn Leu Ile Arg Glu Arg

35 40 45

Ser Gln Gln Gln Glu Pro Leu Leu Cys Pro Ser Tyr Gln Ser Tyr Phe

50 55 60

Lys Thr Ile Glu Glu Leu Gln Gln Lys Ile Leu Cys Ser Lys Ser Glu
65 70 75 80

Asn Ala Arg Leu Val Val Gln Ile Asp Asn Ala Lys Leu Ala Ala Asp

85 90 95

Asp Phe Arg Thr Lys Tyr Gln Thr Glu Gln Ser Leu Arg Gln Leu Val

100 105 110

Glu Ser Asp Ile Asn Ser Leu Arg Arg Ile Leu Asp Glu Leu Thr Leu

115 120 125

Cys Arg Ser Asp Leu Glu Ala Gln Met Glu Ser Leu Lys Glu Glu Leu

130 135 140

Leu Ser Leu Lys Gln Asn His Glu Gln Glu Val Asn Thr Leu Arg Cys
145 150 155 160

Gln Leu Gly Asp Arg Leu Asn Val Glu Val Asp Ala Ala Pro Ala Val

165 170 175

Asp Leu Asn Gln Val Leu Asn Glu Thr Arg Asn Gln Tyr Glu Ala Leu

180 185 190

Val Glu Thr Asn Arg Arg Glu Val Glu Gln Trp Phe Ala Thr Gln Thr

195 200 205

Glu Glu Leu Asn Lys Gln Val Val Ser Ser Ser Glu Gln Leu Gln Ser

210 215 220

Tyr Gln Ala Glu Ile Ile Glu Leu Arg Arg Thr Val Asn Ala Leu Glu
225 230 235 240

Ile Glu Leu Gln Ala Gln His Asn Leu Arg Tyr Ser Leu Glu Asn Thr
245 250 255
Leu Thr Glu Ser Glu Ala Arg Tyr Ser Ser Gln Leu Ser Gln Val Gln

260 265 270

Ser Leu Ile Thr Asn Val Glu Ser Gln Leu Ala Glu Ile Arg Ser Asp

275 280 285

Leu Glu Arg Gln Asn Gln Glu Tyr Gln Val Leu Leu Asp Val Arg Ala

290 295 300

Arg Leu Glu Cys Glu Ile Asn Thr Tyr Arg Ser Leu Leu Glu Ser Glu
305 310 315 320

Asp Cys Lys Leu Pro Ser Asn Pro Cys Ala Thr Thr Asn Ala Cys Glu

325 330 335

Lys Pro Ile Gly Ser Cys Val Thr Asn Pro Cys Gly Pro Arg Ser Arg

340 345 350 

Cys Gly Pro Cys Asn Thr Phe Gly Tyr 
355 360 

<210> 53 

<211> 3282 

<212> DNA 

<213> Homo sapiens 



<400> 53 

atgaaggaga tggtaggagg ctgctgegta tgttoggaog agaggggctg ggccgagaac
ccgctggtct actgogatgg gcacgcgtgc agcgtggccg tccacoaagc ttgetatggc 120 




atcgttcagg tgccaacggg accctggttc tgccggaaat gtgaatctca ggagcgagca 
gccagggtga ggtgtgagct gtgcccacac aaagacgggg cattgaagag gactgataat 
ggaggctggg cacacgtggt gtgtgccctc tacatccccg aggtgcaatt tgccaacgtg 
ctcaccatgg agcccatcgt gctgcagtac gtgcctcatg afccgcttcaa caagacctgt 
tacatctgcg aggagacggg ccgggagagc aaggcggcct cgggagcctg catgacctgt 
aaccgccatg gatgtcgaca agctttccac gtcacctgtg cccaaatggc aggcttgctg 
tgtgaggaag aagtgctgga ggtggacaac gtcaagtact gcggotactg caaataccac 
ttcagcaaga tgaagacatc ccggcacagc agcgggggag gcggaggagg cgctggagga 
ggaggtggca goatgggggg aggtggcagt ggtttcatct ctgggaggag aagccggtca 
gcctcaccat ccacgcagca ggagaagcac cccacccacc acgagagggg ccagaagaag 
agtcgaaagg acaaagaacg ccttaagcag aagcacaaga agcggcctga gtcgcccccc 
agcatcctca cccogcocgt ggtccccact gctgacaagg tctcotcctc ggcttootct 
toctcccacc acgaggccag cacgcaggag acotctgaga gcagcaggga gtcaaagggg 
aaaaagtctt ceagccatag cctgagtcat aaagggaaga aactgagoag tgggaaaggt 
gtgagcagtt ttacctcogc ctcctottct tcctcctcct cttcctcctc ctctgggggg 
cccttccagc ctgcagtctc gtccctgcag agctccoctg acttctctgc attccccaag 
ctggagcagc cagaggagga oaagtactcc aagcccacag cccccgcccc ttcagcooot 
ccttctccct cagctcccga gccccccaag gctgaccttt ttgagcagaa ggtggtcttc 
tctggctttg ggoocatcat gcgcttctcc accaccacct ccagctcagg ccgggoccgg 
gcgccctccc ctggggacta taagtctccc cacgtcacgg ggtctggggc ctcggcaggc 
acccacaaac ggatgcccgc actgagtgcc acccctgtgc ctgctgatga gacccctgag 
acaggcctga aggagaagaa gcaoaaagcc agcaagagga gccgccatgg gccaggccgt 
cccaagggca gccggaacaa ggagggcact gggggcccag ctgococatc cttgcccagt 
acccagctgg ctggctttac cgcoactgct gcctcaccct tctctggagg ttccctggtc 1560 

1620

1680 
1740 
1800 
1860 



60 



180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 




agetooggcc tgggaggtct gtcctcccga acctttgggo cttctgggag cttgcccagc 
ttgagcctgg agtccccctt actaggggca ggcatctaca ccagtaataa ggaccccatc 
tcccacagtg gcgggatgct gcgggctgtc tgcagcaccc ctctctcctc cagcctcctg 
gggcccccag ggacctcggc cctgccccgc ctcagccgct ccccgttcac cagcaccctc 
ccctcctctt ctgcttctat ctccaccact caggtgtttt ctctggctgg ctctaccttt 
agcctccctt ctacccacat ctttggaacc cccatgggtg ccgttaatcc cctcctctcc 1920 
caagctgaga gcagccacac agagccagac ctggaggact gcagcttccg gtgtcggggg 1980 
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2040 
2100 



acctcoccto aggagagtct gtcttecatg tcceocatca gcagcctccc cgoactcttc 
gaccagaoag cctctgcaoc ctgtgggggc ggccagttag acccggcggc ccoagggacg 
actaacatgg agcagcttot ggagaageag ggcgacgggg aggccggcgt caacatcgtg 2160 
gagatgctga aggcgctgca cgcgotgcag aaggagaaca agcggctgca agagcagatc 2220 
ctgagcotga cggocaaaaa ggagcggctg cagattctca acgtgoagct atotgtgcco 2280 
ttccctgccc tgcctgctgc octgcctgoc gccaaeggcc otgtccctgg gccctatggc 2340 
ctgcctcccc aagccgggag cagcgactcc ttgagcacca gcaagagcoo tccgggaaag 2400 
agcagcctcg gcctggacaa ctogctgteo aettcttctg aggacccaca ctoaggctgc 2460 
ccgagccgca gcagctogtc gctgtccttc cacagcacgo ccccaccgct goccctcctc
cagcagagoo ctgccactot gcccotggco ctgcctgggg cccctgcccc actcocgocc 
cagccgcaga acgggttggg ccgggcacco ggggcagcgg ggctgggggc oatgcccatg 
gctgaggggo tgttgggggg gctggeaggc agtgggggco tgcccctcaa tgggctcott 
ggggggttga atggggocgc tgcccocaae cccgcaagct tgagccaggc tggcggggcc 
ocoaogctgo agctgccagg otgtctcaac agccttacag agcageagag aoatctoctt 
cagcagcaag agcagcagot coageaactc oagcagctcc tggcctcccc gcagctgacc 
ccggaacacc agactgttgt ctaccagatg atccagcaga tccagcagaa aogggagctg 
cagcgtctgc agatggetgg gggctcccag ctgcccatgg ccagcctgct ggaaggaagc 
tccaccccgc tgctgtetgo gggtacccct ggcctgctgc ocacagcgtc tgctcoaccc 
ctgctgcccg ctggagcoct agtggctccc tcgcttggca acaaoacaag totoatggcc 
gcagcagctg cagctgeagc agtagcagoa gcaggcggac ctccagtcct cactgcccag 
accaaccoct tcctcagcct gtegggagca gagggoagtg gcggtggocc oaaaggaggg 
accgctgaca aaggagootc agccaaccag gaaaaaggct aa 

<210> 54 

<211> 2227 

<212> DNA 

<213> Homo sapiens 



2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 
3180 
3240 
3282 



<400> 54 

gagagcccga acaggaagag ggtacagctt tgtgcaggtc acatgcccac tgcagccctc 
oagootctgg tccccagagc ggactttgga agctgaactg ottttgttgc tggaagactt 
atgttataat ttacoctggg tggaccaggg tcgtaoaaaa gggcaaogct ccooagtccc 
cccactccog accccggaat catgcatcgg actacacgga tcaaaatcac agagctgaac 
ccccacctea tgtgtgccot otgcgggggg tacttcatcg acgccaocac tatogtggag 
tgcotgcatt ccttctgcaa aaoctgcatc gtgcgctacc tggagaccaa caaatactgc 
ccoatgtgtg aogtgcaggt ccataaaaoc cggccgctgc tgagoatcag gtotgaoaaa 
acacttcaag acattgtcta caaattggte cctgggcttt ttaaagatga gatgaaacgg 
oggogggatt tctatgcagc gtaccccctg acggaggtcc ocaaoggctc caatgaggac 
ogcggcgagg tcttggagoa ggagaagggg gotctgagtg atgatgagat tgtcagcotc 
tccatcgaat tctacgaagg tgccagggac cgggatgaga agaagggocc cctggagaat 
ggggatgggg acaaagagaa aacaggggtg cgcttcctgc gatgcccagc agccatgacc 
gtcatgcatc ttgccaagtt tctccgcaac aagatggatg tgoocagcaa gtacaaggtg 
gaggttctgt aogaggacga gocactgaag gaatactaca ccctcatgga oatogcotac 
atctacccct ggcggcggaa cgggcotctc cocotcaagt accgtgtoca gccagcotgc 
aagcggctca cectagccac ggtgcccacc ccctccgagg goaccaacac cagcggggcg 
tccgagtgtg agtoagtcag cgacaaggct cocagccctg ccaccctgcc agceacctcc 
tcctccctgc ccagcccagc caccccatoo catggctctc coagttccca tgggcotcca 
gccacccacc otacctcccc caotccccct tcgacagcca gtggggccac cacagotgcc 
aacgggggta gcttgaactg cctgcagaca ccatcctcca coagcagggg gcgcaagatg 
actgtcaacg gcgctccegt gccoccctta acttgaggcc agggacocto tcccttcttc 
cagccaagcc tctccactcc ttceactttt tctgggccct tttttccact tcttctaott 
tccccagctc ttcccacctt gggggtgggg ggcgggtttt ataaataaat atatatatat 
atgtacatag gaaaaaccaa atatacatac ttattttcta tggaccaacc agattaattt 
aaatgccaca ggaaacaaao tttatgtgtg tgtgtatgtg tggaaaatgg tgttcatttt 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
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<210> 55 

<211> 4283 

<212> DNA 

<213> Homo sapiens 



<400> 55 

ttgegggaaa gagcoaaaac otggcgttgg ggggcccggg cggggagccc ctoccgcggt 
coacagcgac gcctgcccag coctcctcoc cttocggctc cggcacgggg ccccgaggcg 
ttoggaggcc aggcgggttt otgtcaggcc cggggaggag gggcgggcgg ggcggccgct 
gcctccccgg gacgggccgt aooaogogga cggggaggac ggggcoaggg gactgcaggg 
eggetgcacc gcccgggggc ggggtgcgga gcgggccggc gggctccoog gggcggggcg 
ggagggcggg gcgtggggcg gacggaacca ccggggcggg gtgggaggta acgggacggg 
cgcgaccatg gcgcggtgag ggagcggggg tggggatcgg tccgggggag gcotgaggcc 
gctggcttgt gegctgtctc cgccgccccc ctctttogcc gccgccgccg ccgoccoggg 
catgtcgtcc aactgcacca gcacoacggc ggtggcggtg gcgocgctca gcgccagcaa 
gaccaagacc aagaagaagc atttcgtgtg ccagaaagtg aagctattcc gggccagcga 
gocgatcctc agcgtoctga tgtggggggt gaaccaoacg atcaatgagc tgagcaatgt 
toctgttcct gtoatgotaa tgccagatga cttcaaagoo taoagcaaga tcaaggtgga 
caatcatctc ttoaataagg agaacctgcc cagocgcttt aagtttaagg agtattgccc 



1620 
1680 
1740 
1800 
1860 



ttttgggggg ggtcttgtgt aatttgctgt ttttgggggt gcctggagat gaactggatg 1560 
ggccactgga gtctcaataa agctotgcac catcctcgct gtttcccaag gcaggtggtg
tgttgggggc cccttoagac ccaaagcttt aggcatgatt ccaaetggot gcatatagga 
gtcagttaga attgtttott tctctccoag tttctctccc catcttgget gctgtootgc 
ctctgaocag tggccgcccc ccgcgttgtt gaatgtccag aaattgctaa gaacagtgcc 
ttttaoaaat gcagtttato cctggttctg aggagcaagt gcagggtgga ggtggoacct 
gcatcacctc ctcctcttgc agtggaaact ttgtgcaaag aatagatagt tctgcctctt 1920 
tttttttttt ttcctgtgtg tgtggccttt gcatcattta tcttgtggaa aagaagatto
aggccotgag aggtctoagc tcttggagga gggctaaggc tttageattg tgaagcgctg 
cacccccaoc aaccttaccc tcaccgggga accctcacta gcaggactgg tggtggagtc 
toacetgggg cctagagtgg aagtgggggt gggttaacct cacacaagca cagatcccag 
actttgccag aggcaaacag ggaattoego cgatactgac gggctccagg agtcgtcgcc 
acactcg 



1980 
2040 
2100 
2160 
2220 
2227 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 



ttc 840 
900 
960 
1020 



catggtgttc cgaaaccttc gggagaggtt tggaattgat gatcaggatt accaga 
agtgacgcgc agcgcccoca tcaacagtga cagccagggt cggtgtggoa ogogtttcct 
caccacctac gaccggogct ttgtcatoaa gaotgtgtcc agcgaggacg tggcggagat 
geaeaaoatc ttaaagaaat accaccagtt tatagtggag tgtoatggca acacgctttt 
gccacagttc ctgggcatgt accgcctgac cgtggatggt gtggaaacct acatggtggt 1080 
taccaggaac gtgttcagoc atcggctoao tgtgcatogc aagtatgaoo toaagggttc 1140 
tacggttgcc agagaagcga gcgacaagga gaaggccaag gacttgccaa cattcaaaga 
caatgacttc ctcaatgaag ggcagaagot gcatgtggga gaggagagta aaaagaactt 
ectggagaaa ctgaagcggg acgttgagtt cttggcacag ctgaagatca tggactacag 
cotgctggtg ggcatccacg acgtggaocg ggcagagcag gaggagatgg aggtggagga 
gogggcagag gacgaggagt gtgagaatga tggggtgggt ggcaacctao tctgctocta 
tggcacacct ccggacagcc ctggcaaoot cctcagcttt cctcggttct ttggtcctgg 
ggaattcgac ccctctgttg acgtctatgc catgaaaago catgaaagtt oocccaagaa 
ggaggtgtat ttcatggcca tcattgatat cctcacgcca tacgatacaa agaagaaagc 
tgcacatgot gcoaaaacgg tgaaacacgg ggcaggggco gagatctcga ctgtgaaccc 
tgagcagtac tccaaacgot tcaacgagtt tatgtccaac atcctgacgt agttctottc 
taccttcagc cagagccaga gagotggata tggggtcggg gatcgggagt tagggagaag 
ggtgtatttg ggctagatgg gagggtggga gcagagtcgg gtttgggagg gctttagcaa 
tgagactgca gcctgtgaca cogaaagaga ctttagctga agaggagggg gatgtgctgt 
gtgtgcacct gctcacagga tgtaacocca ccttctgctt acccttgatt ttttctcccc 
atttgacaco caggttaaaa aggggttcoc tttttggtac cttgtaacct tttaagatac 2040 



1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
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cttggggcta gagatgactt cgtgggttta 
ccaggtttgo tatttataat catatttcat 
gctctcagtt cccttcaatt aaagagatac 
aaccaagtgc tatggatgcc agattggaga 
cttgtctgga ttaactttgt aatttatgga 
ttggattcaa gtgaaaactg ttgcattatt 
acaccagaga tctcagatoa gaatcagaga 
gatggtgagg ggcaggaaag cggctgggct 
gatgatcacc oaagecccag gctgtcttag 
gcctgcctoc oactaggtoa agaggaacta 

gtttaaatgg ttatttccot ttgggaaaae 
gtgccctcga tctattfccct tcttccttct 
cgaactctta gotctgggoa gatetccaag 
cgttaagctc cgggatgacc ttgtaggaga 
cagcaaggtg cccccatctt agagtgtggt 
tgaggatgtg atccaggaaa tcoagtttgg 
gcettggggc tgtttttcct tgttgccceg 
aggctttttt ggaaggagtg gcttcctgca 
ccaggccctc agaactgagc cacaggctgc 
ctgcattggg ggaggtaoat tcctgcatct 
tgggatgaga tatggtcaga ettgtagata 
ccattttgaa tgaatagatt ggtttcctgt 
tccgaagcag gaaccagcae tgtctctgtg 
atggagacgg cattcttgga cttcactggg 
agaggcagat gggggtcaaa ccactgcctt 
aacaactgcc gcaagaccac tacatgactt 
aaaacaaatt tgacttggga aagggattat 
acgtcataat gaagaatgga agtttggatc 
cttggaaagc tgtcttccag cctccaaacc 
agcactgtot otgtgoctga ctcacagcat 
ttggacttca ctggggctgc tggattggat 
tcaaaccact gccttggccc caggaagggg 
accactacat gacttaggga acttgaaacc 
tgggaaaggg attatgtagg aataatgttt 
atggaagttt ggatctgcto ctcgtcaggo 
tccagcagcc tccgtggcct cgggttccta 
tgttgccata atgtgtatgg aaagtgtaac 
gtatctaaot tgtttaacat tga 

<210> 56 

<211> 6140 

<212> DNA 

<213> Homo sapiens 



tttgggtttt gtttctgaaa tttoattgct 
□agcctaccc accctcccca tctttgctga 
ccagtagacc cagcacaagg gtccttccag 
ggtcagacac ctcgccctgc tgeatttgct 
gtattgtgoa caacttcctc cacotttccc 
cctccatcct gtctggaata caecaggtca 
tctcagaggg gaataagttc atcctcatgg 
cttggacacc tggttcteag agaaccctgt 
cccctggagt tcagaagtcc tctotgtaaa 
gagtaccttt ggatttatca ggaccctcat 

ttcagaaact gatgtatcaa atgaggccct 
gacotcetcc caggcactet taettctagc 
cgcctggagt gctttttagc agagaoacct 
tctgtctcae tgtgcctgga gagttacagc 
gtccaaacgt gaggtggctt cctagttaca 
aggcttgatg tgggttttga cctggcatca 
otctagactt ttagcagate tgoageccac 
ggtgttccac ctgccttcgg agcctgccac 
totggcoagg agagaaacag ctctgttgtt 
tctcaccccc tcaaccagga actggggatt 
aocccaaaga tgtgaagatc gcttgtgaaa 
ggctccctcc aaacctggcc aagcocagot 
eotgactcac agcatatagg tcaggaaaga 
gctgctggat tggatgggaa acettctgga 
ggccccagga aggggccata ggtaggtotg 
agggaacttg aaaccaactg gctcatggag 
gtaggaataa tgtttggact tgatttoocc 
tgctcctcgt caggcgcagc atctctgaag 
tggccaagcc oagcttccga agcaggaacc 
ataggtcagg aaagaatgga gaoggcattc 
gggaaacctt ctggaagagg cagatggggg 
ccataggtag gtctgaacaa ctgccgcaag 
aactggctca tggagaaaac aaatttgact 
ggacttgatt tccocacgtc ataatgaaga 
gcagcatctc tgaagcttgg aaagctgtct 
ccggcttctc tgoatttggt ctgctgatca 
acattcttac tggttaaaga cgaotaccag 



<400> 56 

gcggccgcag cctgagccag ggccccctcc 
ggggcaggtc cgggcaccca ccatgcgagg 
ggaggotgco cgggcgctga gcocccagcc 
tggatgggct gccaaaggga ccgtgcgggg 
gcatgtgtca gagccggaca ggacccagct 
catggacacg ctgccagata acaggaccag 
gtcccgtctc tatggcccca gcgagcccca 
ggccaaccgg agocaagtga agatccacac 



ctcgtcagga ccggggcagc aagcaggoog 
cgagctotgg otoctggtgc tggtgctcag 
cggagcaggt caogatgagg gcccaggotc 
otggaacogg agagcccgag agagcactgg 
gagccaggac ctgggtgggg goacectggc 
ggtggtggag gacaaccaca gctattatgt 
cagccgggaa ctgtgggtag atgtggccga 
aatactctcc aacaoccacc ggcaggcttc 
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gagagtggtc ttgtcotttg atttcccttt otacgggcat cototgcggc agatcaccat 540 

agcaaotgga ggcttoatct tcatggggga cgtgatccat cggatgotca cagctaotca 600 

gtatgtggcg cooctgatgg ccaacttcaa ccctggctac tccgacaact ccacagttgt 660 

ttactttgac aatgggaoag tctttgtggt toagtgggac caogtttatc tccaaggctg 720 

ggaagacaag ggcagtttca cottccaggc agctctgcac catgacggcc goattgtctt 780 

tgcotataaa gagatcccta tgtctgtocc ggaaatcagc toctcccagc atoctgtcaa 840 

aaccggccta toggatgcct tcatgattct caatccatcc ceggatgtgc cagaatctcg 900 

gcgaaggagc atctttgaat accaccgcat agagctggac cccagcaagg tcaccagcat 960 

gtcggccgtg gagttcaocc cattgccgac ctgcctgcag cataggagct gtgacgcctg 1020 

catgtcctca gacctgacct tcaactgcag ctggtgccat gtcctccaga gatgctccag 1080

tggctttgac cgctatcgce aggagtggat ggaetatggc tgtgcacagg aggcagaggg 1140 

caggatgtgc gaggacttcc aggatgagga ccacgactca gcctcccctg aeaottcctt 1200 

cagcocotat gatggagaec taaeeactac ctcctcctcc ctottcatcg acagcctcao 1260 

cacagaagat gacaccaagt tgaatcccta tgoaggagga gaoggccttc agaacaaoct 1320 

gtcccccaag acaaagggca ctcctgtgca cctgggcacc atcgtgggca tcgtgctggc 1380

agtcctcctc gtggcggcca tcatcetgge tggaatttac atcaatggcc acoooaoatc 1440 

caatgctgcg otcttcttca tcgagcgtag acotoaceac tggceagcca tgaagtttcg 1500 

oagccaecct gaccattoca ectatgcgga ggtggagocc togggccatg agaaggaggg 1560 

cttcatggag gotgagcagt gctgagaaca coaagtctcc ectttgaaga ctttgaggcc 1620 

acagaaaaga cagttaaagc aaagaagaga agtgactttt cctggcctct cccagcatgc 1680 

cctgggctga gatgagatgg tggtttatgg ctccagagct gctgttcgct tcgtcagcac 1740

accccgaata ttgaagaggg ggccaaaaaa caaccacatg gattttttat aggaacaaca 1800 

acctaatctc atcctgtttt gatgcaaggg ttetottctg tgtcttgtaa ccatgaaaea 1860 

goagaagaac taacataact aactccattt ttgtttaagg ggoctttacc tattcetgca 1920 

octaggctag gataacttta gagcactgac ataaaacgca aaaacaggaa toatgccgtt 1980 

tgcaaaacta actctgggat taaaggggaa gcatgtaaac agctaactgt ttttgttaaa 2040

gatttatagg aatgaggagg tttggctatt gtcacatgac agactgttag ccaaggacaa 2100 

agaagttctg caaacctccc otggaccctt gotggtgtec agatgtctgo ggttgtcagc 2160 

coottccttt cccocgacct aaacataaaa gacaaggcaa agocogcata attttaagac 2220 

ggttctttag gacattagtc caccatcttc ttggtttgct ggctctccga aataaagtcc 2280 

ctttccttgc tccaactcct tgtctctcaa cgtattggct atgacgcagc aagcagaatg 2340 

aatttggact cagttacagg ctgtcaatgg tctgctctgt agcagtctca gagcctcccc 2400

gaoocaetao etggagatag ccagatagcc agatgccctg ctcctggcca ootttaaagc 2460 

coctgcatat gacacaggtt aactaaagtc aagattgggg ctgctgcatt ccaggttccc 2520 

tagactcaca agctggtcct tggccaggtg cagtggctca cgcctgtaat cccagcaott 2580 

tgggaggctg aggoaggcgg atcacctgaa gtcagaagtt tgagaooagc ctggccaaca 2640 

taattaaaat gtctctacta aaaatacaaa aaattagctg ggtgtggtga cgcttgcctg 2700

tatcccagct actcaggaag otgagacacg agaatcactt gaacctggga ggoagaggtt 2760 

gcagtgagct cagatagtgc cactgcaotc cagcctgggt gacagagcga gactcogtot 2820 

caaaaaaaaa aaaagaaagc agaacctoat ggotatagag ttggcatttt agcccoagct 2880 

totgtagctc tgaaagccta aagaaggtat tctctcoato tgttaaacac agtatagtgg 2940 

ctctcagccc ttggggoatg ttatcatggg agggaagtca aataagagga gagaaaagaa 3000 

otoaaggggg aaactgcatt tttaggcttt gctctcttac ottgcccttt otaotcagaa 3060 

ccaataactt ctgcatcaaa aoatgttaca gcotgcatca agggctttac cccaacctgc 3120 

agcccagcct tcoctgggtg agottgctat gogcagccao atttaccatg tggggctccc 3180 

tattctgatg goctgttcgg tgccgggttt actcactgcc ctgttctgat gtcagtgcct 3240 

gtacatacct ocaaaggcag gacttgcotg ataaatattt ttcotcctct gaactggatt 3300 

ttataggcat taaagacaag tcgggtggct agagggctcc ttgagacata cctagcaggg 3360

aactgcaggt ggattctgtt gagaggcaaa gcacctgagt ggttgggaca caggcagctg 3420 

geatgggagg gacttttttt gagacagggt ctcactgtgt cgcocagggc aaggatgccc 3480 

aaagacacca ggttggagag gcacotgcca actacttgct ttccctggag cotgcatgtg 3540 

cctgtggggt ggggaggcgt aggggtctac ggotgcctga gatgggtgtg cacagtgtgt 3600 

gaagtaccta cctccttgcc ttgctggact gtcagccagt cgcagggccg gccacaagac 3660

ccatgtctcc atctggtcat actocatagc tacoaagtta aeotgctcta aactttggag 3720 

aaotggatct gtecaataaa cgcttatttg gccaagcctg atggotcgtg cotgtactco 3780 

cagcactttg ggaggctgag gtgggagggt tgottgagcc caggggtttg agaccagctt 3840 

gggoaacaac aaoaaaaatg ccaggtgtgg tggggtgcac ctgtagtccc agctactagg 3900 

gaggctgagc caggaggatc acttgagccc gggaggttga ggctgcagtg gggggtcata 3960 

atcatgccac tgtactccag cctgggtgac agagtgagac cctgtctccg aaaaaaaaaa 4020
aaaaaaaaga acggaaaaag aaatgcttac attgtcaggg atcctgtaga caatcattaa 4080

ctctatgaga tgcttggttc tatttttttg ggagactttg tccaagtgtt ttggcttaag 4140

aaatccatag gcctctcttg gtgacacatc tctagtactt tttgtcataa acaaacaggc 4200

catctgccgc caaatacatc cactccccat gccactgaca tcctatgggt cagccaggct 4260

tgctttgact gaggccgagg catctggaac tttctctgcc tgcaggggct agcagcagag 4320

gcttcaccgc atcaccaccc cttcctccac tcctgacatt ctttcccttc agggatccaa 4380

aatggttggc cgagctccca gtgggaaaac gtgtgctaga gttggggagt gagatgagtg 4440

gtgctgtcca tggaatcagg ccacagcagg aactgcccca ctggccattt gagacacaca 4500

caggtggtaa atgctctgct ggtgggctgt gcttccctca ttcagagagc tctgttacag 4560

cccactgtgt cctttagaag cttgaaagga acccaactct ttgctgcact gtcctttttc 4620

ttcctcaaat tcagaccctc cttccaccgg caccccccta ctccaccctc agctcttcct 4680

tgcctggttt atcaagcaga gctgaggccc cacgtttcca actctgattg tcacttgcat 4740

cttcacaaag gataaaccac ggagcaactg gaaaaccatc agccaagcgt tcggatgagt 4800

ctggttattg gtccaccccc gaccagattc ccttacactt aactcacttc tttctttggc 4860

aatgaccctc atgacatgta taaatgggta tgactaagaa gaggctgtga tctaacattt 4920

atttgctgcc attttttaat ctggggagaa gcagccccaa ctcatcactg ggaaagaact 4980

ccccctgcaa accagctaaa tttgataatt taaaccccct gcccctaaaa cttctcacag 5040

agctggggag ttggtggcaa ctttccaagt caaggtcttg cttagaaagt ccttcactac 5100

atggccaggt gcagtggctc acgcctgtag tcccaggtac ttgggagcct gaggcaggag 5160

gattgcttga gctcaggagt tcaaggctgc agagagctat gatcatccca ctgcatttgt 5220

ttaaaaataa atttttaaaa tttgtgtgtt ttatcagggg tctcctgtac agtgtatctg 5280

tgtatgtttg tgtgtgtgtt tgtatacagc cttgtttaat gttttgagca ataagatatg 5340

cacacacagg tattttgttg ctaaagagat tggacaaggt tgtagctgtg ctcaggcttc 5400

agcttggttt gttaaattga gagataaaca atgacaagag ctgccagcca aocacaotat 5460 

tcaaaaagca aagtgttcac caetaaaget aaceatteat ctggttgcag gcaaggotaa 5520 

ggctctctct cctctagttc ctggaacaga ctcacagatt ggoatgaago actgatoagg 5580 

ggetgcaetc agactooctg gcoaagoaaa cctacaccag aagagtcagt gtcacagata 5640 

tgatgcggcc aatctctgtc tccaaaaacc tacctgaaot taatggtaga attcaaagat 5700 

ctggggactg agggcaccca gccttctaaa acacaatgta ttcatgtgtt tagtgtaaac 5760 

tctctgcatg gattctcagt gttaataata aaaggaagca ttcttttaca actcctgctg 5820 

tgtgcaaaag aaagtgcaaa ggatttggag tggcattocg aagatcacoa cacatacott 5880 

ggttctgatg gctgctgaac tccgacttct tcgctgagac atgactgtgg gaacagcctc 5940 

cagctatctg ctcatcagag gtgctttoct caacctcotg caccaoctcc aagagaaaca 6000 

gcctaaaaag aaaccccagc tgtttaotta tattggtctg taaatccctg gaagtaaacc 6060 

ccatgcattt ttatctactg tctgaggaca tacaataaat ctgagaaagt ctatgctgtc 6120 

6140 



<210> 57
<211> 2098
<212> DNA 
<213> Homo sapiens 



<400> 57 

gcaggagcac gtggagaggc cgggtagcca cagcggcagc tccagcccgg cccggcagcg 60

acatggaaga tatacaaaca aatgcggaac tgaaaagcac tcaggagcag tctgtgcccg 120

cagaaagtgc agcggttttg aatgactaca gtttaaccaa atctcatgaa atggaaaatg 180

tggacagtgg agaaggccca gccaatgaag atgaagacat aggagatgat tcaatgaaag 240

tgaaagatga atacagtgaa agagatgaga atgttttaaa gtcagaaccc atgggaaatg 300

cagaagagcc tgaaatccct tacagctatt caagagaata taatgaatat gaaaacatta 360

agttggagag acatgttgtc tcattcgata gtagcaggcc aaccagtgga aagatgaact 420

gcgatgtgtg tggattatcc tgcatcagct tcaatgtctt aatggttcat aagcgaagcc 480

atactggtga acgcccattc cagtgtaatc agtgtggggc atcttttact cagaaaggta 540

acctcctccg ccacattaaa ctgcacacag gggaaaaacc ttttaagtgt cacctctgca 600

actatgcatg ccaaagaaga gatgcgctca cggggcatct taggacacat tctgtggaga 660

aaccctacaa atgtgagttt tgtggaagga gttacaagca gagaagttcc cttgaggagc 720

acaaggagcg ctgccgtaca tttcttcaga gcactgaccc aggggacact gcaagtgcgg 780

aggcaagaca catcaaagca gagatgggaa gtgaaagagc tctcgtactg gacagattag 840

caagcaatgt ggcaaaacga aaaagctcaa tgcctcagaa attcattggt gagaagcgcc 900

actgctttga tgtcaactat aattcaagtt acatgtatga gaaagagagt gagctcatac 960

agacccgcat gatggaccaa gccatcaata acgccatcag ctatcttggc gccgaagccc 1020

tgtgcccctt ggtccagaca ccgcctgctc ccacctcgga gatggttcca gttatcagca 1080

gcatgtatcc catagccctc acccgggctg agatgtcaaa cggtgcccct caagagctgg 1140

aaaggaaaag catcctcctt ccagagaaga gcgtgccttc tgagagaggc ctctctccca 1200

acaatagtgg ccacgactcc acggacactg acagcaacca tgaagaacgc cagaatcaca 1260

tctatcagca aaatcacatg gtcctgtctc gggcccgcaa tgggatgcca cttctgaagg 1320

aggttccccg ctcttacgaa ctcctcaagc ccccgcccat ctgcccaaga gactctgtca 1380

aagtgatcga caaggaaggg gaggtgatgg atgtgtatcg gtgtgaccac tgccgcgtcc 1440

tcttcctgga ctatgtgatg ttcacgattc acatgggctg ccacggcttc cgtgaccctt 1500

tcgagtgtaa catgtgtgga gatcgaagcc atgatcggta tgaattctcg tctcacatag 1560

ccagaggaga acacagaagc ctgctgaagt gaatatctgg tctcagggat tgctcctatg 1620

tattcagcat cgtttctaaa aacagttgac ctcgcctaac agattgctct caaaacatac 1680

tcagttccaa acttcttttc ataccatttt tagctgtgtt cacaggggta gccagagaaa 1740

cactgtcttc cttcagaaat tattcgcagg tctagcatat tattactttt gtgaaacctt 1800

tgttttccca tcagggactt gaattttatg gaatttaaaa gccaaaaagg tatttggtca 1860

ttatcttcta cagcagtgga atgagtggtc ccggagatgt gctatatgaa acattctttc 1920

tgagatatat caaccacacg tggaaaagcc tttcagtcat acatgcaaat ccacaaagag 1980

gaagagctga ccagctgacc ttgctgggaa gcctcaccct tctgcccttc acaggctgaa 2040

gggttaagat ctaatctccc taatctaaat gacagtctaa gagtaagtaa aagaacag 2098

<210> 58

<211> 2947

<212> DNA

<213> Homo sapiens



<400> 58 

atgccaattc ctcctccccc gccaccccca cctggtcctc ctccacctcc cacatttcat 60
caggcaaaca cagagcagcc caagctgagt agagatgagc agcggggtcg aggcgccctc 120
ttacaggaca tttgcaaagg gaccaagctg aagaaggtga ccaacattaa tgatcggagt 180
gctcccatcc tcgagaagcc gaaaggaagc agtggtggct atggctctgg aggagctgcc 240
ctgcagccca agggaggtct cttccaagga ggagtgctga agcttcgacc tgtgggagcc 300
aaggatggtt cagagaacct agctggtaag ccagccctgc aaatccccag ttctcgagct 360
gctgccccaa ggcctccagt atctgccgcc agcgggcgtc ctcaggatga tacagacagc 420
agccgggcct cactcccaga actgccccgg atgcagagac cctctttacc ggacctctct 480
cggcctaata ccaccagcag tacgggcatg aagcacagct cctctgcccc tcccccacca 540
cccccagggc ggcgtgccaa cgcacccccc acacctctgc ctatgcacag cagcaaagcc 600
cccgcctaca acagagagaa acccttgcca ccgacgcctg gacaaaggct tcaccctggt 660
cgagagggac ctcctgctcc acccccagtc aaaccacctc cttcccctgt gaatatcaga 720
acaggaccaa gtggccagtc tctggctcct cctcctccgc cttaccgcca gcctcctggg 780
gtccccaatg gaccctctag ccccactaat gagtcagccc ctgagctgcc acagagacac 840
aattctttgc ataggaagac accagggcct gtcagaggcc tagcacctcc tccacccacc 900
tcggcctccc catctttact gagtaatagg ccacctcccc cagcccgaga ccctcccagt 960
cggggagcag ctcctccacc cccaccacct gtgatccgaa atggtgccag ggatgctccc 1020
cctcccccac caccataccg aatgcatggg tcagaacccc cgagccgagg aaagccccca 1080
cctccaccct caaggacgcc agctgggcca ccccctcctc ctccaccgcc cctgaggaat 1140
ggccacagag attctatcac cactgtccgg tctttcttgg atgattttga gtcaaagtat 1200
tccttccatc cagtagaaga ctttcctgct ccagaagaat ataaacactt tcagaggata 1260
tatcccagca aaacaaaccg agctgcccgt ggagccccac ctctgccacc cattctcagg 1320
tgaagcctgg cttggtcccg ttcctcagga aaaggatgga ccttctcttc ttctcagatg 1380
gtcccttcca ttcccctgaa acctgcatga gagctcctaa catgtttctc caatgcaatc 1440
aagccctaga ctccaaatgt cctcccagct cacctccatc tatgcatctc atctctggat 1500
ttggtgatca gactctatat tgacagtagg atctcaaacc ctgcatccat ccttcctcca 1560
gcaagccctg ctagccacat gaggaacaag tttccgtgtc ttctgccttc ctcttgggga 1620
aaggtgcctt gttgtgatga attaactcac tgttagggca gggtggagaa tggtactcct 1680
tccttctcct gtccactgtg ggggaagctt ggcaggtata ttatatttca tcatttagga 1740
ggctggcatg accaggactt atgggtggga ggggagcatt tttagtgaag caagaaagga 1800
gtttgccaag aagtgatctg ttttaaaggt catatttgga gaaagggcaa ggaattgggt 1860
ctgctttatt tttgggggta ttttgttttt gttctcacct gctgcccccc caccccacca 1920
ccccagggat aaattggata taaacactaa atactaatca gttgaactta acatttaata 1980
aaaagaaagg gtgaaataaa ctgaagacca ttttagaact agtcagttct ctgcagcaaa 2040
gggaacagga gccatttgaa ccctctggga cccctcaccc cactgcttca gggtgctagg 2100
ctgagggatg tttttcctcc cccttaccgc ccatgccctt gaaagaaaag tcactttttg 2160
tggagggcat cattcattcc tgattcacaa accccaaaaa cctctggtgg gagataggaa 2220
gatagggcgt gggcctgggc cttaacctca atcttgtgtc tgcctcagtc ttttctgact 2280
ggccctgaag ttgtcagtgg ctctttctgt ccttcagccc ctggaaggtg ctccaggata 2340
acaaagaagg gaaggttgaa gcccctcatg gaaggagctg gctttgtggg gctgcaaagg 2400
acttttaagt cctgcctgta ctgaagttca cagcccacct gactgagcag actcttcctg 2460
ttcctttctc taccaccctt gccttcccag gactgcacgg tttaacacag cagagtacag 2520
aagggtgaag aagtgagcag aggcttatga agatattcag atactcttct atgccaggaa 2580
gcacaaagac tttgttgaga tttgcctcag ttcagtagat cttccttggc agccagccat 2640
aggttgtttc tttgtcttcc gggtcctaaa gagcacagag aaaatggagg tccccagtct 2700
aggtaggaag ctgattggat gaggacttct ttttttccga cagcaggatg gggctcttgg 2760
gctccacaca ccagatgctt tggttttcta caactgttgc tatgtgtaga gggtgctaag 2820
agcgtggcat gagagcaagg agaccatggc tactctttga aatggatggg gaaaattagc 2880
ttaaaaattt aatcacgaga ttgcgccact gcactccagc ctgggcgaca gagccagact 2940
ccgtctc 2947

<210> 59

<211> 784

<212> DNA

<213> Homo sapiens



<400> 59 

gagcggttgc gcagtgaagg 
tcctagtaca ccgcaatcat 
aaggggaaga actgtgtggc 
gtgaccacgg acttccagaa 
gggctcgcca ctgacgtcca 
gagttgaagg aaggtcggca 
ttgtatgaga aacggtttgg 
aagaccttta agcocttcat 
gactttgtgg tcagtggcac 
gagcccaaca tggatccgga 
gtggaccggg atgcagtgto 
atcacoacca ggacactgaa 
tttctttttt tgaaataaaa 
aaaa 



ctagacccgg tttactggaa 
gtctattatg tcctataacg 
catcgctgca gacaggcgct 
gatctttccc atgggtgacc 
gacagttgcc cagcgcctca 
gatcaaacct tatacootca 
cccttactac aotgagccag 
ttgctctcta gacctcatcg 
ctgcgccgaa caaatgtaog 
tcacotgttt gaaaccatct 
aggcatggga gtcattgtcc 
ggcccgaatg gactaaocct 
tagcotgtct ttcaaaaaaa 



ttgctctggc gatcgagggg 
gaggggccgt catggccatg 
tcgggatcca ggcccagatg 
ggctgtacat cggtctggcc 
agttccggct gaacctgtat 
tgagcatggt ggocaacctc 
tcattgccgg gttggacccg 
gctgccccat ggtgactgat 
gaatgtgtga gtccctctgg 
cccaagccat gctgaatgct 
aoatcatcga gaaggacaaa 
gttcccagag cocacttttt 
<210> 60

<211> 3033 

<212> DNA 

<213> Homo sapiens 



<400> 60



atactcctaa gctcctcccc cggcggcgag ccagggagaa aggatggccg gcctggcggc 60
gcggttggtc ctgctagctg gggcagcggc gctggcgagc ggctcccagg gcgaccgtga 120
gccggtgtac cgcgactgcg tactgcagtg cgaagagcag aactgctctg ggggcgctct 180
gaatcacttc cgctcccgcc agccaatcta catgagtcta gcaggctgga cctgtcggga 240
cgactgtaag tatgagtgta tgtgggtcac cgttgggctc tacctccagg aaggtcacaa 300
agtgcctcag ttacatggca agtggccctt ctcccggttc ctgttctttc aagagccggc 360
atcggccgtg gcctcgtttc tcaatggcct ggccagcctg gtgatgctct gccgctaccg 420
caccttcgtg ccagcctcct cccccatgta ccacacctgt gtggccttcg cctgggtgtc 480
cctcaatgca tggttctggt ccacagtctt ccacaccagg gacactgacc tcacagagaa 540
aatggactac ttctgtgcct ccactgtcat cctacactca atctacctgt gctgcgtcag 600
gtgagcctgc ctgggtggct gcaggggcaa aatcgaaccc tgggggcaga aaggggtcac 660
ccagccttcc cctgggggcc ttcttcacta gtctcccaac acctacgccc cccaaccccc 720
aacacatcag ctgtcctggg tgaggactct ggggtaggac tgggggccct ggctcctgac 780
aaggagctgt agcacttgct gcccagctgt ggcctgtttg gtggggagag gggtagtgac 840
ttcaggggcc atgcaccaat gttgggggga ggagatgctt cagggaatgc tgctctgggg 900
atgggccacc tgccctctga gcaaccctgg acggtggggc aggaccgtgg ggctgcagca 960
cccagctgtg gtcagtgcct tccgggctct cctgctgctc atgctgaccg tgcacgtctc 1020
ctacctgagc ctcatccgct tcgactatgg ctacaacctg gtggccaacg tggctattgg 1080
cctggtcaac gtggtgtggt ggctggcctg gtgcctgtgg aaccagcggc ggctgcctca 1140
cgtgcgcaag tgcgtggtgg tggtcttgct gctgcagggg ctgtccctgc tcgagctgct 1200
tgacttccca ccgctcttct gggtcctgga tgcccatgcc atctggcaca tcagcaccat 1260
ccctgtccac gtcctctttt tcagctttct ggaagatgac agcctgtacc tgctgaagga 1320
atcagaggac aagttcaagc tggactgaag accttggagc gagtctgccc cagtggggat 1380
cctgcccccg ccctgctggc ctcccttctc ccctcaaccc ttgagatgat tttctctttt 1440
caacttcttg aacttggaca tgaaggatgt gggcccagaa tcatgtggcc agcccacccc 1500
ctgttggccc tcaccagcct tggagtctgt tctagggaag gcctcccagc atctgggact 1560
cgagagtggg cagcccctct acctcctgga gctgaactgg ggtggaactg agtgtgttct 1620
tagctctacc gggaggacag ctgcctgttt cctccccacc agcctcctcc ccacatcccc 1680
agctgcctgg ctgggtcctg aagccctctg tctacctggg agaccaggga ccacaggcct 1740
tagggataca gggggtcccc ttctgttacc accccccacc ctcctccagg acaccactag 1800
gtggtgctgg atgcttgttc tttggccagc caaggttcac ggcgattctc cccatgggat 1860
cttgagggac caagctgctg ggattgggaa ggagtttcac cctgaccgtt gccctagcca 1920
ggttcccagg aggcctcacc atactccctt tcagggccag ggctccagca agcccagggc 1980
aaggatcctg tgctgctgtc tggttgagag cctgccaccg tgtgtcggga gtgtgggcca 2040
ggctgagtgc ataggtgaca gggccgtgag catgggcctg ggtgtgtgtg agctcaggcc 2100
taggtgcgca gtgtggagac gggtgttgtc ggggaagagg tgtggcttca aagtgtgtgt 2160
gtgcaggggg tgggtgtgtt agcgtgggtt aggggaacgt gtgtgcgcgt gctggtgggc 2220
atgtgagatg agtgactgcc ggtgaatgtg tccacagttg agaggttgga gcaggatgag 2280
ggaatcctgt caccatcaat aatcacttgt ggagcgccag ctctgcccaa gacgccacct 2340
gggcggacag ccaggagctc tccatggcca ggctgcctgt gtgcatgttc cctgtctggt 2400
gcccctttgc ccgcctcctg caaacctcac agggtcccca cacaacagtg ccctccagaa 2460
gcagcccctc ggaggcagag gaaggaaaat ggggatggct ggggctctct ccatcctcct 2520
tttctccttg ccttcgcatg gctggccttc ccctccaaaa cctccattcc cctgctgcca 2580
gcccctttgc catagcctga ttttggggag gaggaagggg cgatttgagg gagaagggga 2640
gaaagcttat ggctgggtct ggtttcttcc cttcccagag ggtcttactg ttccagggtg 2700
gccccagggc aggcaggggc cacactatgc ctgcgccctg gtaaaggtga cccctgccat 2760
ttaccagcag ccctggcatg ttcctgcccc acaggaatag aatggaggga gctccagaaa 2820
ctttccatcc caaaggcagt ctccgtggtt gaagcagact ggatttttgc tctgcccctg 2880
accccttgtc cctctttgag ggaggggagc tatgctagga ctccaacctc agggactcgg 2940
gtggcctgcg ctagcttctt ttgatactga aaacttttaa ggtgggaggg tggcaaggga 3000
tgtgcttaat aaatcaattc caagcctcac ctg 3033

<210> 61

<211> 1174

<212> DNA

<213> Homo sapiens

<400> 61

aagctcctcc cccggcggcg agccagggag aaaggatggc cggcctggcg gcgcggttgg 60
tcctgctagc tggggcagcg gcgctggcga gcggctccca gggcgaccgt gagccggtgt 120
accgcgactg cgtactgcag tgcgaagagc agaactgctc tgggggcgct ctgaatcact 180
tccgctcccg ccagccaatc tacatgagtc tagcaggctg gacctgtcgg gacgactgta 240
agtatgagtg tatgtgggtc accgttgggc tctacctcca ggaaggtcac aaagtgcctc 300
agttccatgg caagtggccc ttctcccggt tcctgttctt tcaagagccg gcatcggccg 360
tggactcgtt tctcaatggc ctggccagcc tggtgatgct ctgccgctac cgcaccttcg 420
tgccagcctc ctcccccatg taccacacct gtgtggcctt cgcctgggtg tccctcaatg 480
catggttctg gtccacagtc ttccacacca gggacactga cctcacagag aaaatggact 540
acttctgtgc ctccactgtc atcctacact caatctacct gtgctgcgtc aggaccgtgg 600
ggctgcagca cccagctgtg gtcagtgcct tccgggctct cctgctgctc atgctgaccg 660
tgcacgtctc ctacctgagc ctcatccgct tcgactatgg ctacaacctg gtggccaacg 720
tggctattgg cctggtcaac gtggtgtggt ggctggcctg gtgcctgtgg aaccagcggc 780
ggctgcctca cgtgcgcaag tgcgtggtgg tggtcttgct gctgcagggg ctgtccctgc 840
tcgagctgct tgacttccca ccgctcttct gggtcctgga tgcccatgcc atctggcaca 900
tcagcaccat ccctgtccac gtcctctttt tcagctttct ggaagatgac agcctgtacc 960
tgctgaagga atcagaggac aagttcaagc tggttgaagc agactggatt tttgctctgc 1020
ccctgacccc ttgtccctct ttgagggagg ggagctatgc taggactcca acctcaggga 1080
ctcgggtggc ctgcgctagc ttcttttgat actgaaaact tttaaggtgg gagggtggca 1140
agggatgtgc ttaataaatc aattccaagc ctca 1174
<210> 62 
<211> 3167 
<212> DNA 
<213> Homo sapiens 



<400> 62 

aagctcctcc cccggcggcg agccagggag aaaggatggc cggcctggcg gcgcggttgg 60
tcctgctagc tggggcagcg gcgctggcga gcggctccca gggcgaccgt gagccggtgt 120
accgcgactg cgtactgcag tgcgaagagc agaactgctc tgggggcgct ctgaatcact 180
tccgctcccg ccagccaatc tacatgagtc tagcaggctg gacctgtcgg gacgactgta 240
agtatgagtg tatgtgggtc accgttgggc tctacctcca ggaaggtcac aaagtgcctc 300
agttccatgg caagtggccc ttctcccggt tcctgttctt tcaagagccg gcatcggccg 360
tggcctcgtt tctcaatggc ctggccagcc tggtgatgct ctgccgctac cgcaccttcg 420
tgccagcctc ctcccccatg taccacacct gtgtggcctt cgcctggatg agaaaactga 480
ggcacagcaa ggctaaataa cttgcccaag gacacacagg aaatgcagag ccaggaactg 540
aaccctggca gtctggctgt agggcttgca ttcttaatga taccactacc tcccaaatct 600
gaggaaaggg tgtccctcaa tgcatggttc tggtccacag tcttccacac cagggacact 660
gacctcacag agaaaatgga ctacttctgt gcctccactg tcatcctaca ctcaatctac 720

ctgtgctgcg tcaggtgagc ctgcctgggt ggctgcaggg gcaaaatcga accctggggg 780

cagaaagggg tcacccagcc ttcccctggg ggccttcttc actagtctcc caacacctac 840

gccccccaac ccccaacaca tcagctgtcc tgggtgagga ctctggggta ggactggggg 900

ccctggctcc tgacaaggag ctgtagcact tgctgcccag ctgtggcctg tttggtgggg 960

agaggggtag tgacttcagg ggccatgcac caatgttggg gggaggagat gcttcaggga 1020

atgctgctct ggggatgggc cacctgccct ctgagcaacc ctggacggtg gggcaggacc 1080

gtggggctgc agcacccagc tgtggtcagt gccttccggg ctctcctgct gctcatgctg 1140

accgtgcacg tctcctacct gagcctcatc cgcttcgact atggctacaa cctggtggcc 1200

aacgtggcta ttggcctggt caacgtggtg tggtggctgg cctggtgcct gtggaaccag 1260

cggcggctgc ctcacgtgcg caagtgcgtg gtggtggtct tgctgctgca ggggctgtcc 1320

ctgctcgagc tgcttgactt cccaccgctc ttctgggtcc tggatgccca tgccatctgg 1380

cacatcagca ccatccctgt ccacgtcctc tttttcagct ttctggaaga tgacagcctg 1440

tacctgctga aggaatcaga ggacaagttc aagctggact gaagaccttg gagcgagtct 1500

gcoocagtgg ggatcctgcc oocgecotgc tggootcect tetcccctca acccttgaga 1560 

tgattttctc ttttcaactt ottgaaottg gacatgaagg atgtgggccc agaatcatgt 1620 

ggocagccca coooctgttg gooctcacca gocttggagt ctgttctagg gaaggcctcc 1680 

cagcatctgg gaotcgagag tgggcagecc etetacctcc tggagctgaa ctggggtgga 1740 

actgagtgtg ttottagotc taocgggagg acagctgect gtttcctccc caccagcctc 1800 

ctoccoacat ccccagctgc ctggctgggt cctgaagcco tctgtctace tgggagacca 1860 

gggaccacag gcottaggga tacagggggt ccccttctgt tacoaccccc caccotoctc 1920 

caggacaoca ctaggtggtg etggatgctt gttctttggc cagooaaggt tcacggcgat 1980 

totccceatg ggatcttgag ggaccaagct gctgggattg ggaaggagtt teacoetgac 2040 

cgttgcccta gccaggttcc caggaggcct caccatactc cctttcaggg ccagggctcc 2100

agcaagccca gggcaaggat cctgtgctgc tgtctggttg agagcctgcc accgtgtgtc 2160

gggagtgtgg gccaggctga gtgcataggt gacagggccg tgagcatggg cctgggtgtg 2220

tgtgagctca ggcctaggtg cgcagtgtgg agacgggtgt tgtcggggaa gaggtgtggc 2280

ttcaaagtgt gtgtgtgcag ggggtgggtg tgttagcgtg ggttagggga acgtgtgtgc 2340

gcgtgctggt gggcatgtga gatgagtgac tgccggtgaa tgtgtccaca gttgagaggt 2400

tggagcagga tgagggaatc ctgtcaccat caataatcac ttgtggagcg ccagctctgc 2460

ccaagacgcc acctgggcgg acagccagga gctctccatg gccaggctgc ctgtgtgcat 2520

gttccctgtc tggtgcccct ttgcccgcct cctgcaaacc tcacagggtc cccacacaac 2580

agtgccctcc agaagcagcc cctcggaggc agaggaagga aaatggggat ggctggggct 2640

ctctccatcc tccttttctc cttgccttcg catggctggc cttcccctcc aaaacctcca 2700

ttcccctgct gccagcccct ttgccatagc ctgattttgg ggaggaggaa ggggcgattt 2760

gagggagaag gggagaaagc ttatggctgg gtctggtttc ttcccttccc agagggtctt 2820

actgttccag ggtggcccca gggcaggcag gggccacact atgcctgcgc cctggtaaag 2880

gtgacccctg ccatttacca gcagccctgg catgttcctg ccccacagga atagaatgga 2940

gggagctcca gaaactttcc atcccaaagg cagtctccgt ggttgaagca gactggattt 3000

ttgctctgcc cctgacccct tgtccctctt tgagggaggg gagctatgct aggactccaa 3060

cctcagggac tcgggtggcc tgcgctagct tcttttgata ctgaaaactt ttaaggtggg 3120

agggtggcaa gggatgtgct taataaatca attccaagcc tcacctg 3167

<210> 63

<211> 2733

<212> DNA

<213> Homo sapiens
<220>

<221> misc_feature

<222> (2694)..(2694)

<223> n=a, c, g or t



<220>

<221> misc_feature

<222> (2724)..(2724)

<223> n=a, c, g or t



<400> 63 

agggagaaag gatggccggc otggcggcgc ggttggtoct gotagctggg goagcggcgo 60 

tggcgagcgg ctcccagggc gaccgtgagc cggtgtaccg cgactgcgta ctgcagtgcg 120

aagagcagaa ctgctctggg ggcgctotga atcacttcog ctccogccag ccaatctaca 180 

tgagtotago aggctggacc tgtcgggacg actgtaagta tgagtgtatg tgggtoaccg 240 

ttgggctota cctccaggaa ggtcacaaag tgcctcagtt ccatggcaag tggcccttct 300 

cccggttecfc gttctttcaa gagccggcat eggccgtggc otegtttcte aatggcctgg 360 

ccagcctggt gatgctctgc cgctaccgca ccttcgtgcc agcctcctcc cccatgtacc 420

aeacctgtgt ggcettegce tgggtgtcca teaatgcatg gttctggtcc acagtcttcc 480 

acaccaggga cactgaccta cagagaaaat ggactacttc tgtgcctcct gtatootaca 540 

ctcaatctao ctgtgctgcg tcaggacogt ggggctgcag cacccagctg tggtcaagtg 600 

ccttccgggo tctcctgctg ctcatgctga cogtgcacgt ctcctacctg agcctcatcc 660 

gcttcgacta tggctacaac ctggtggcca acgtggctat tggcctggtc aacgtggtgt 720 

ggtggctggc ctggtgcctg tggaaccagc ggcggctgcc tcacgtgcgc aagtgcgtgg 780

tggtggtctt gctgctgcag gggctgtccc tgctcgagct gcttgactto ccaccgctct 840 

tctgggtcot ggatgcccat gccatctggc acatcagcac catccctgtc cacgtcctct 900 

ttttoagott tctggaagat gacagcctgt acctgctgaa ggaatoagag gacaagttca 960 

agctggactg agaccttgga gcgaagtctg ccccagtggg gatcotgccc ccgccctgct 1020 

ggcctccctt ctcccctcaa cccttgagat gattttctct tttcaacttc ttgaacttgg 1080

acatgaagga tgtgggccca gaatcatgtg gccagcccao cccctgttgg ccctcaccag 1140 

cottggagtc tgttctaggg aaggcctccc agcatctggg actcgagagt gggcagcccc 1200 

tctaoctoct ggactgaact ggggtggaac tgagtgtgtt cttagctcta oogggaggao 1260 

agctgcotgt ttcctecoca ccagcctcct ccccacatcc ccagctgcct ggctgggtcc 1320 

tgaagccctc tgtctacctg ggagaccagg gtaccacagg ccttagggat acagggggtc 1380

coottctgtt accacccccc accctcctcc aggacaccac taggtggtgc tggatgcttg 1440 

ttctttggoc agcoaaggtt cacggegatt ctccceatgg gatcttgagg gaccaagctg 1500 

ctgggattgg gaaggagttt caccctgacc gttgccctag ccaggttccc aggaggcctc 1560 

accatactcc ctttoagggc cagggctoca gcaagcccag ggcaaggatc ctgtgctgct 1620 

gtctggttga gagcctgcca ccgtgtgtcg ggagtgtggg ccaggctgag tgoataggtg 1680 

acagggccgt gagcatgggc ctgggtgtgt gtgagctcag gcctaggtgc gcagtgtgga 1740

gacgggtgtt gtcggggaag aggtgtggct tcaaagtgtg tgtgtgoagg gggtgggtgt 1800 

gttagogtgg gttaggggaa cgtgtgtgcg cgtgctggtg ggcatgtgag atgagtgact 1860 

gccggtgaat gtgtccaoag ttgagaggtt ggagoaggat gagggaatcc tgtcaooatc 1920 

aataatcaot tgtggagcgc cagctctgcc caagacgcca cctgggcgga cagccaggag 1980 

ctctccatgg ccaggctgcc tgtgtgcatg ttccctgtct ggtgcccctt tgcccgcctc 2040

ctgcaaacct cacagggtco ccacacaaca gtgccctcca gaagcagccc otcggaggca 2100 

gaggaaggaa aatggggatg gctggggctc tctccatcct ccttttctoc ttgccttcgc 2160 

atggotggcc ttcccctcca aaacctcoat tcccctgctg ccagcccctt tgccatagoc 2220 

tgattttggg gaggaggaag gggcgatttg agggagaagg ggagaaagct tatggctggg 2280 

tctggtttct tcccttocca gagggtotta ctgttccagg gtggocccag gcagcagggc 2340 

cacactatgc ctgcgccctg gtaaaggtga cccctgccat ttaccagcag ccctggcatg 2400
ttcctgcccc acaggaatag aatggaggga gctccagaaa ctttccatcc caaaggcagt 2460

ctccgtggtt gaagcagact ggatttttgc tctgcccctg accccttgtc cctctttgag 2520

ggaggggagc tatgctagga ctccaacctc agggactcgg gtggcctgcg ctagcttctt 2580

ttgatactga aaacttttaa ggtgggaggg tggcaaggga tgtgcttaag cggccgcgaa 2640

ttcaaaaagc ttctcgagag tacttctaga gcggccgcgg gcccatcgat tttnccaccc 2700

gggtggggta cccaggtaag tgtnccccat atc 2733
<210> 64 
<211> 2546
<212> DNA 
<213> Homo sapiens 



<400> 64 

aagctcctcc caeggeggeg agecagggag aaaggatggo cggootggcg gogcggttgg 60 

tcctgctagc tggggeagog gcgetggcga geggotccca gggcgaccgt gagccggtgt 120 

accgcgactg cgtactgcag tgegaagage agaactgetc tgggggeget ctgaatcact 180 



tccgctcccg ccagccaatc tacatgagtc tagcaggctg gacctgtcgg gacgactgta 240

agtatgagtg tatgtgggtc accgttgggc tctacctcca ggaaggtcac aaagtgcctc 300

agttccatgg caagtggccc ttctcccggt tcctgttctt tcaagagccg gcatcggccg 360

tggcctcgtt tctcaatggc ctggccagcc tggtgatgct ctgccgctac cgcaccttcg 420

tgccagcctc ctcccccatg taccacacct gtgtggcctt cgcctgggtg tccctcaatg 480

catggttctg gtccacagtc ttccacacca gggacactga cctcacagag aaaatggact 540

acttctgtgc ctccactgtc atcctacact caatctacct gtgctgcgtc aggcctggtc 600

aacgtggtgt ggtggctggc ctggtgcctg tggaaccagc ggcggctgcc tcacgtgcgc 660

aagtgcgtgg tggtggtctt gctgctgcag gggctgtccc tgctcgagct gcttgacttc 720

ccaccgctct tctgggtcct ggatgcccat gccatctggc acatcagcac catccctgtc 780

cacgtcctct ttttcagctt tctggaagat gaeagectgt acctgetgaa ggaatcagag 840 

gaoaagttoa agotggactg aagaccttgg agegagtotg ccccagtggg gatcotgeoc 900 

ccgccctgct ggoctcoctt ctcccctcaa cccttgagat gattttctct tttcaacttc 960 

ttgaacttgg aoatgaagga tgtgggccca gaatcatgtg gooagoccac cccctgttgg 1020 

cootcaccag ccttggagtc tgttctaggg aaggcctccc agcatctggg actcgagagt 1080 

gggcagcccc totacctoot ggagctgaac tggggtggaa otgagtgtgt tcttagctot 1140 

acegggagga cagctgeotg tttcctcoco acoagcctcc toocoacatc cccagotgoe 1200 

tggctgggtc otgaagcoct ctgtctacct gggagaccag ggaccaoagg ccttagggat 1260 

acagggggtc eccttctgtt accacccccc aecotoctoo aggacaccac taggtggtgc 1320 

tggatgottg ttctttggcc agecaaggtt caoggegatt ctocooatgg gatcttgagg 1380 

gaccaagctg otgggattgg gaaggagttt caoootgacc gttgecctag ccaggttccc 1440 

aggaggcotc accatactcc ctttcagggc cagggctcca gcaagcccag ggcaaggatc 1500 

ctgtgctgct gtctggttga gagootgeca ccgtgtgtcg ggagtgtggg ccaggotgag 1560 

tgcataggtg acagggcegt gagcatgggc ctgggtgtgt gtgagctcag gectaggtgo 1620 

gcagtgtgga gaogggtgtt gtoggggaag aggtgtggct tcaaagtgtg tgtgtgcagg 1680 

gggtgggtgt gttagcgtgg gttaggggaa cgtgtgtgcg ogtgctggtg ggcatgtgag 1740 

atgagtgact googgtgaat gtgtccacag ttgagaggtt ggagcaggat gagggaatcc 1800 

tgtcaccatc aataatcact tgtggagcgo oagctctgcc eaagacgeca cctgggcgga 1860 

cagecaggag ctctccatgg ccaggctgcc tgtgtgcatg ttocotgtct ggtgoccctt 1920 

tgcccgcctc ctgcaaacct cacagggtcc ccacacaaca gtgccctcca gaagcagccc 1980 

cteggaggea gaggaaggaa aatggggatg gctggggcto totooatcct ccttttotoo 2040 

ttgccttcgc atggotggcc tteocctcca aaacctccat tcccctgctg ccagcccctt 2100 

tgccatagcc tgattttggg gaggaggaag gggcgatttg agggagaagg ggagaaagct 2160 

tatggctggg tctggtttct tcccttccca gagggtctta ctgttccagg gtggcoccag 2220 

ggcaggcagg ggccacacta tgootgegee ctggtaaagg tgacccctgc catttaccag 2280 
cagccctggc atgttcctgc cccacaggaa tagaatggag ggagctccag aaactttcca 2340

tcccaaaggc agtctccgtg gttgaagcag actggatttt tgctctgccc ctgacccctt 2400

gtccctcttt gagggagggg agctatgcta ggactccaac ctcagggact cgggtggcct 2460

gcgctagctt cttttgatac tgaaaacttt taaggtggga gggtggcaag ggatgtgctt 2520

aataaatcaa ttccaagcct cacctg 2546

<210> 65 

<211> 2683 

<212> DNA 

<213> Homo sapiens 



<400> 65 

aagctcctcc cccggcggcg agccagggag aaaggatggc cggcctggcg gcgcggttgg 60
tcctgctagc tggggcagcg gcgctggcga gcggctccca gggcgaccgt gagccggtgt 120
accgcgactg cgtactgcag tgcgaagagc agaactgctc tgggggcgct ctgaatcact 180
tccgctcccg ccagccaatc tacatgagtc tagcaggctg gacctgtcgg gacgactgta 240
agtatgagtg tatgtgggtc accgttgggc tctacctcca ggaaggtcac aaagtgcctc 300
agttccatgg caagtggccc ttctcccggt tcctgttctt tcaagagccg gcatcggccg 360
tggcctcgtt tctcaatggc ctggccagcc tggtgatgct ctgacgctac cgcaccttcg 420
tgccagcctc ctcccccatg taccacacct gtgtggcctt cgcctgggtg tccctcaatg 480
catggttctg gtccacagtc ttccacacca gggacactga cctcacagag aaaatggact 540
acttctgtgc ctccactgtc atcctacact caatctacct gtgctgcgtc aggaccgtgg 600
ggctgcagca cccagctgtg gtcagtgcct tccgggctct cctgctgctc atgctgaccg 660
tgcacgtctc ctacctgagc ctcatccgct tcgactatgg ctacaacctg gtggccaacg 720
tggctattgg cctggtaaac gtggtgtggt ggctggcctg gtgcctgtgg aaccagcggc 780
ggctgcctca cgtgcgcaag tgcgtggtgg tggtcttgct gctgcagggg ctgtccctgc 840
tcgagctgct tgacttccca ccgctcttct gggtcctgga tgcccatgcc atctggcaca 900
tcagcaccat ccctgtccac gtcctctttt tcagctttct ggaagatgac agcctgtacc 960
tgctgaagga atcagaggac aagttcaagc tggactgaag accttggagc gagtctgccc 1020
cagtggggat cctgcccccg ccctgctggc ctcccttctc ccctcaaccc ttgagatgat 1080
tttctctttt caacttcttg aacttggaca tgaaggatgt gggcccagaa tcatgtggcc 1140
agcccacccc ctgttggccc tcaccagcct tggagtctgt tctagggaag gcctcccagc 1200
atctgggact cgagagtggg cagcccctct acctcctgga gctgaactgg ggtggaactg 1260
agtgtgttct tagctctacc gggaggacag ctgcctgttt cctccccacc agcctcctcc 1320
ccacatcccc agctgcctgg ctgggtcctg aagccctctg tctacctggg agaccaggga 1380
ccacaggcct tagggataca gggggtcccc ttctgttacc accccccacc ctcctccagg 1440
acaccactag gtggtgctgg atgcttgttc tttggccagc caaggttcac ggcgattctc 1500
cccatgggat cttgagggac caagctgctg ggattgggaa ggagtttcac cctgaccgtt 1560
gccctagcca ggttcccagg aggcctcacc atactccctt tcagggccag ggctccagca 1620
agcccagggc aaggatcctg tgctgctgtc tggttgagag cctgccaccg tgtgtcggga 1680
gtgtgggcca ggctgagtgc ataggtgaca gggccgtgag catgggcctg ggtgtgtgtg 1740
agctcaggcc taggtgcgca gtgtggagac gggtgttgtc ggggaagagg tgtggcttca 1800
aagtgtgtgt gtgcaggggg tgggtgtgtt agcgtgggtt aggggaacgt gtgtgcgcgt 1860
gctggtgggc atgtgagatg agtgactgcc ggtgaatgtg tccacagttg agaggttgga 1920
gcaggatgag ggaatcctgt caccatcaat aatcacttgt ggagcgccag ctctgcccaa 1980
gacgccacct gggcggacag ccaggagctc tccatggcca ggctgcctgt gtgcatgttc 2040
cctgtctggt gcccctttgc ccgcctcctg caaacctcac agggtcccca cacaacagtg 2100
ccctccagaa gcagcccctc ggaggcagag gaaggaaaat ggggatggct ggggctctct 2160
ccatcctcct tttctccttg ccttcgcatg gctggccttc ccctccaaaa cctccattcc 2220
cctgctgcca gcccctttgc catagcctga ttttggggag gaggaagggg cgatttgagg 2280
gagaagggga gaaagcttat ggctgggtct ggtttcttcc cttcccagag ggtcttactg 2340
ttccagggtg gccccagggc aggcaggggc cacactatgc ctgcgccctg gtaaaggtga 2400
cccctgccat ttaccagcag ccctggcatg ttcctgcccc acaggaatag aatggaggga 2460
gctccagaaa ctttccatcc caaaggcagt ctccgtggtt gaagcagact ggatttttgc 2520
tctgcccctg accccttgtc cctctttgag ggaggggagc tatgctagga ctccaacctc 2580
agggactcgg gtggcctgcg ctagcttctt ttgatactga aaacttttaa ggtgggaggg 2640
tggcaaggga tgtgcttaat aaatcaattc caagcctcac ctg 2683
<210> 66 

<211> 2341 

<212> DNA 

<213> Homo sapiens 



<400> 66 

aagctcctcc cccggcggcg agccagggag aaaggatggc cggcctggcg gcgcggttgg 60

tcctgctagc tggggcagcg gcgctggcga gcggctccca gggcgaccgt gagccggtgt 120

accgcgactg cgtactgcag tgcgaagagc agaactgctc tgggggcgct ctgaatcact 180

tccgctcccg ccagccaatc tacatgagtc tagcaggctg gacctgtcgg gacgactgta 240

agtatgagtg tatgtgggtc acagttgggc tetacatcca ggaaggtcac aaagtgcctc 300 

agttccatgg caagtggooc ttctcccggt tcotgttott tcaagagccg gcatcggccg 360 

tggcctogtt totcaatggc ctggccagco tggtgatgct ctgccgotac cgoaecttog 420 

tgooagcctc otooccoatg taocacaoct gtgtggcctt cgootgggtg tccctcaatg 480 

catggttctg gtccacagtc ttoeacacca gggacactga cctcaoagag aaaatggact 540 

acttctgtgc otocactgtc atectacact eaatctacct gtgctgogtc agctttctgg 600 

aagatgacag cctgtaootg ctgaaggaat cagaggacaa gttcaagotg gactgaagac 660 

cttggagcga gtotgcccca gtggggatcc tgoccccgee ctgctggoct cccttctcco 720 

otcaaccctt gagatgattt tctcttttca acttottgaa cttggacatg aaggatgtgg 780 

gcccagaatc atgtggccag cccaccccct gttggccctc accagccttg gagtctgttc 840 

tagggaaggc ctcccagcat ctgggactcg agagtgggea gcooetctac etcctggagc 900 

tgaactgggg tggaactgag tgtgttctta gctctaccgg gaggacagct gcctgtttoc 960 

tccccaccag cctcctooco acatccocag ctgcctggct gggtcctgaa gccotctgtc 1020 

tacctgggag accagggacc acaggcctta gggatacagg gggtcccctt ctgttaceac 1080 

cccccaccct cotccaggac accactaggt ggtgctggat gcttgttctt tggcoagcca 1140 

aggttcacgg cgattcteec catgggatct tgagggaoca agotgotggg attgggaagg 1200 

agtttcaccc tgaccgttgc cctagccagg ttcccaggag gcctcaccat actccctttc 1260 

agggccaggg ctccagcaag cccagggcaa ggatoctgtg otgctgtctg gttgagagcc 1320 

tgccaccgtg tgtcgggagt gtgggccagg ctgagtgcat aggtgacagg gccgtgagca 1380 

tgggcctggg tgtgtgtgag otoaggccta ggtgcgcagt gtggagacgg gtgttgtcgg 1440 

ggaagaggtg tggcttoaaa gtgtgtgtgt gcagggggtg ggtgtgttag cgtgggttag 1500 

gggaacgtgt gtgcgogtgo tggtgggcat gtgagatgag tgactgccgg tgaatgtgtc 1560 

cacagttgag aggttggago aggatgaggg aatcctgtca coatoaataa tcacttgtgg 1620 

agcgccagot ctgccoaaga cgccacctgg goggacagcc aggagctctc catggooagg 1680 

ctgcctgtgt gcatgttcco tgtctggtgc ooctttgccc gcctcctgca aacctcacag 1740 

ggtccccaca caaoagtgcc ctccagaagc agcccctcgg aggcagagga aggaaaatgg 1800 

ggatggctgg ggctctctcc atcctccttt totocttgcc ttcgcatggc tggccttccc 1860 

ctccaaaaco tccattccoo tgctgccagc ccctttgcca tagcotgatt ttggggagga 1920 

ggaaggggog atttgaggga gaaggggaga aagcttatgg ctgggtctgg tttcttccct 1980 

tcccagaggg tcttactgtt ccagggtggc oocagggcag gcaggggcca eaotatgoct 2040 

gegcootggt aaaggtgacc cctgooattt accagcagcc ctggcatgtt cctgccccac 2100 

aggaatagaa tggagggagc tocagaaact ttocatcooa aaggcagtot cogtggttga 2160 

agcagactgg atttttgcto tgcccctgac cccttgtccc tctttgaggg aggggagcta 2220 

tgctaggact ccaacctcag ggaotcgggt ggootgcgot agcttctttt gatactgaaa 2280 

acttttaagg tgggagggtg gcaagggatg tgottaataa atcaattcoa agootcaoot 2340 

a 2341 



<210> 67 

<211> 2109 

<212> DNA 

<213> Homo sapiens 



<400> 67 

gattcggccg gagctgccag cggggaggct gcagccgcgg gttgttacag ctgotggagc 60 

agcagcggcc ccogotcccg ggaaccgttc ccgggccgtt gatcttcggc cccacacgaa 120 

cagcagagag gggoagcagg atgaatgtgg geacagcgca oagcgaggtg aaccccaaca 180 

cgcgggtgat gaacagccgt ggcatctggc tctoctacgt gctggccatc ggtctcctcc 240 

acatogtgct gctgagcatc ocgtttgtga gtgtoootgt cgtetggaco otcaccaaoc 300 

tcattcacaa catgggcatg tatatcttcc tgcacaoggt gaaggggaca ccotttgaga 360 

ccccggacca gggoaaggcg aggetgotaa cocactggga goagatggat tatggggtoc 420 

agttcacggc ctctcggaag ttcttgacca tcacaoooat cgtgctgtac ttcetcacea 480 

gcttotacao taagtaegac eagatocatt ttgtgctcaa caccgtgtcc ctgatgagog 540 

tgcttatccc caagctgccc cagctccacg gagtccggat ttttggaatc aataagtact 600 

gagagtgcag ccccttcccc tgcccagggt ggcaggggag gggtagggta aaaggcatgt 660 

gctgcaacac tgaagacaga aagaagaagc ctctggacac tgccagagat gggggttgag 720 

cctctggoct aatttccccc ctcgettccc ccagtagcca acttggagta gcttgtagtg 780 

gggttggggt aggcoocctg ggetatgacc ttttctgaat tttttgatot cttccttttg 840 

ctttttgaat agagactcoa tggagttggt catggaatgg gctgggotoe tgggctgaac 900 

atggaeeacg cagttgcgac aggaggecag gggaaaaacc cctgctcact tgtttgeoct 960 

caggcagcca aagoacttta acccctgcat agggageaga gggoggtacg gcttctggat 1020 

tgtttcactg tgattcctag gttttttcga tgccatgcag tgtgtgcttt tgtgtatgga 1080 

agcaagtgtg ggatgggtot ttgootttct gggtagggag ctgtctaatc caagtcccag 1140 

gcttttggca gcttctctgc aacccaccgt gggtcctggt tgggagtggg gagggtcagg 1200 

ttggggaaag atggggtaga gtgtagatgg cttggttcca gaggtgaggg ggccagggot 1260 

gctgccatcc tggcctggtg gaggttgggg agctgtagga gagctagtga gtcgagactt 1320 

agaagaatgg ggocacatag cagcagagga ctggtgtaag ggagggaggg gtagggaoag 1380 

aagctagacc caatotoott tgggatgtgg gcagggaggg aagcaggctt ggagggttaa 1440 

tttacccaca gaatgtgata gtaatagggg agggaggctg ctgtgggttt aaotootggg 1500 

ttggctgttg ggtagacagg tggggaaaag gcccgtgagt cattgtaagc acaggtccaa 1560 

cttggccctg actcctgogg gggtatgggg aagctgtgac agaaaogatg ggtgotgtgg 1620 

tcctctgcag gocctcaccc cttaacttcc tcatgcagac tggcactggg cagggcctot 1680 

eatgtggcag ccacatgtgg cgttgtgagg ccaococatg tggggtctgt ggtgagagtc 1740 

etgtaggatc octgctcaag cagcacagag gaaggggcaa gacgtggcct gtaggcactg 1800 

totcagcctg oagagaagaa agtgaggocg ggagootgag cctgggctgg agccttotcc 1860 

cctccccagt tggactaggg gcagtgttaa ttttgaaaag gtgtgggtcc ctgtgtcctt 1920 

ttcoaggggt ooaagggaac aggagaggtc actgggcctg ttttotcoct cotgaccctg 1980 

catotcccac cctgtgtatc atagggaact ttoaocttaa aatctttota agcaaagtgt 2040 
gaataggatt tttactccct ttgtacagta ttctgaggaa 
gtttctgtt 

<210> 68 

<211> 2423 

<212> DNA 

<213> Homo sapiens 



2100 
2109 



<400> 68 

gagagccgag ctagcgacga gcagtcgttg 
gcctagccgg agccgagagg tctcttgttc 
gcgcccagtc cccgtcccgg aactoccggg 
gctcgccgca gcgoccggcc ogggoogcac 
gagtgaccaa gagcaggttt gagatgttct 
aacttcccaa agaaotcctg ttacggatat 
gctgtgatoa ggtctccagg gcctggaatg 
gaattgacct atttgatttc cagagggata 
aacgatgtgg gggcttttta cgaaagttaa 
atgcattaag aaootttgca caaaactgca 
gtacaaagac aacagacgct acatgtacta 
accttgaott ggcttcctgt acatcaataa 
gatgtccact gttggagcag ttgaacattt 
ttcaagcact agtgaggggc tgtgggggte 
agctagaaga tgaagctctc aagtacatag 
acttgcagac ttgcttgcaa atcacagatg 
ataagttaca atccctttgt geotctggct 
ctotaggtca gaaatgccoa cggcttagaa 
cagatgtggg ctttaccaot ctagccagga 
aagagtgtgt tcagataaca gatagcaoat 
ttoaagtatt gagtctgtet caotgtgage 
ggaatggggc otgcgcccat gaccagctgg 
tcacagatgc atccctggag cacttgaaga 
atgactgoca gcaaatcaca cgggctggaa 
ttaaagtcca cgoctacttc gcacctgtca 
agcgcttctg cagatgctgc atoatcctat 
gagtatttaa tgacacttct agagctaccg 
gttctgagca agggttacaa agtgagggag 
acatacacat acacaccctt acccccatcc 
ttgtgatggc ttttttatoa agtagattgg 
ataagaaaat cataggccaa gatagggagg 
ctgtggtttt taaatttttg tctaggggtt 
tcctccgggg tcaagaaaag catggaaaga 
aaaatttctt tgcatcttag aaatggtaga 
tccttgttto catgcoaaoa tgctgagcat 
gtgttttagt atttggccca gaggtttcot 
aatgtaattc ttacacaoto aaattatcac 
oattctctgt tgctacccoo cacactottg 
tcactcaatt gttacccttt tgctgttgtc 
ttgggattac caaacatttt ttaaaaagat 
ttttaaaaaa aaaaaaaaaa aaa 

<210> 69 

<211> 1841 

<212> DNA 

<213> Homo sapiens 



cggccgccgg cgccgcggga ggtggtggag 
ccgtcocacg gteccggcgt cacccctccg 
cctgtcctgg gcccccggtc tgtgeactcc 
ccgccggccc catgaggagg gacgtgaacg 
caaatagtga tgaagctgta atcaataaaa 
tttcttttct agatgttgtt accctgtgcc 
ttctggetct ggatggcagt aactggcagc 
ttgagggccg agtagtggag aatatttcaa 
gtcttcgtgg atgtcttgga gtgggagaca 
ggaacattga agtaotgaat otaaatgggt 
gccttagcaa gttctgttcc aaactcaggc 
caaacatgto tctaaaagct ctgagtgagg 
cctggtgtga ooaagtaaoc aaggatggca 
tcaaggcctt attcttaaaa ggctgcacgc 
gtgcacactg ccotgaactg gtgactttga 
aaggtctcat taotatatgc agagggtgco 
gotccaacat caoagatgco atcctgaatg 
tattggaagt ggcaagatgt tctcaattaa 
attgcoatga acttgaaaag atggacctgg 
taatccaact ttctatacac tgtcotegac 
tgatcacaga tgatggaatt cgtcaectgg 
aggtgattga gctggacaac tgcccactaa 
gctgtcatag ccttgagcgg atagaactot 
tcaagagact caggacccat ttacccaata 
ctccaccccc atcagtaggg ggcagcagac 
gacaatggag gtggtcaacc ttggcgaact 
tggagtctct ccagtggaag caaocccagt 
ggcagtgtcc agatccccag agccacacat 
aotctagctt tgtgaccatg ggactgaagt 
taaaatttaa ocattcctgt tgaggtgccc 
ggcattccag caaaccccgt gttaatgota 
tctttgggga ttttagaaca gcatctgotg 
caatatatga tgtacccagg gaccagaaag 
oattcattgt gactaaagag ottctatgct 
gctcaoaaag aaggctcgtc cattcotcct 
aaatggttgc cttgaaatca ctgtggtcca 
tgtctgtagc acacttgtgo aootgtctta 
ctcagtctgt cacctgttca gtotgcttac 
gtgtttacag tttgcatttt gaatgattag 
attatcaata aatatttttt taattctaaa 



<400> 69 

agatgggacc ggagggtgag cccggcagag gcagagacac acgcggagag gaggagaggc 60
tgagggaggg aggtggagaa ggacgggaga ggcagagaga ggagacacgc agagacactc 120
aggaggggag agacaccgag acgcagagac actcaggagg ggagagacac cgagacgcag 180
agacacccag gccggggagc gcgagggagc gaggcacaga cctggctcag cgagcgcggg 240
gggcgagccc cgagtcccga gagcctgggg gcgcgcccag cccgggcgcc gaccctcctc 300
ccgctcccgc gccctcccct cggcgggcac ggtattttta tccgtgcgcg aacagccctc 360
ctcctcctct cgccgcacag cccgccgcct gcgcggggga gcccagcaca gaccgccgcc 420
gggaccccga gtcgcgcacc ccagccccac cgcccacccc gcgcgccatg gaccccaagg 480
accgcaagaa gatccagttc tcggtgcccg cgccccctag ccagctcgac ccccgccagg 540
tggagatgat ccggcgcagg agaccaacgc ctgccatgct gttccggctc tcagagcact 600

cctcaccaga ggaggaagcc tccccccacc agagagcctc aggagagggg caccatctca 660
agtcgaagag acccaacccc tgtgcctaca caccaccttc gctgaaagct gtgcagcgca 720
ttgctgagtc tcacctgcag tctatcagca atttgaatga gaaccaggcc tcagaggagg 780
aggatgagct gggggagctt cgggagctgg gttatccaag agaggaagat gaggaggaag 840
aggaggatga tgaagaagag gaagaagaag aggacagcca ggctgaagtc ctgaaggtca 900
tcaggcagtc tgctgggcaa aagacaacct gtggccaggg tctggaaggg ccctgggagc 960
gcccaccccc tctggatgag tccgagagag atggaggctc tgaggaccaa gtggaagacc 1020
cagcactaag tgagcctggg gaggaacctc agcgcccttc cccctctgag cctggcacat 1080
aggcacccag cctgcatctc ccaggaggaa gtggagggga catcgctgtt ccccagaaac 1140
ccactctatc ctcaccctgt tttgtgctct tcccctcgcc tgctagggct gcggcttctg 1200
acttctagaa gactaaggct ggtctgtgtt tgcttgtttg cccacctttg gctgataccc 1260
agagaacctg ggcacttgct gcctgatgcc cacccctgcc agtcattcct ccattcaccc 1320
agcgggaggt gggatgtgag acagcccaca ttggaaaatc cagaaaaacg ggaacaggga 1380
tttgcccttc acaattctac tccccagatc ctctcccctg gacacaggag acccacaggg 1440
caggacccta agatctgggg aaaggaggtc ctgagaacct tgaggtaccc ttagatcctt 1500
ttctacccac tttcctatgg aggattccaa gtcaccactt ctctcaccgg cttctaccag 1560
ggtccaggac taaggcgttt ttctccatag cctcaacatt ttgggaatct tcccttaatc 1620
acccttgctc ctcctgggtg cctggaagat ggactggcag agacctcttt gttgcgtttt 1680
gtgctttgat gccaggaatg ccgcctagtt tatgtccccg gtggggcaca cagcgggggg 1740
cgccaggttt tccttgtccc ccagctgctc tgcccctttc cccttcttcc ctgactccag 1800
gcctgaaccc ctcccgtgct gtaataaatc tttgtaaata a 1841 



<210> 70 

<211> 748 

<212> DNA 

<213> Homo sapiens 



<400> 70 

ggccgcgatg agcggggagc cggggcagac gtccgtagcg ccccctcccg aggaggtcga 60
gccgggcagt ggggtccgca tcgtggtgga gtactgtgaa ccctgcggct tcgaggcgac 120
ctacctggag ctggccagtg ctgtgaagga gcagtatccg ggcatcgaga tcgagtcgcg 180
cctcgggggc acaggtgcct ttgagataga gataaatgga cagctggtgt tctccaagct 240
ggagaatggg ggctttccct atgagaaaga tctcattgag gccatccgaa gagccagtaa 300
tggagaaacc ctagaaaaga tcaccaacag ccgtcctccc tgcgtcatcc tgtgactgca 360
caggactctg ggttcctgct ctgttctggg gtccaaacct tggtctccct ttggtcctgc 420
tgggagctcc ccctgcctct ttcccctact tagctcctta gcaaagagac cctggcctcc 480
actttgccct ttgggtacaa agaaggaata gaagattccg tggccttggg ggcaggagag 540
agacactctc catgaacact tctccagcca cctcataccc ccttcccagg gtaagtgccc 600
acgaaagccc agtccactct tcgcctcggt aatacctgtc tgatgccaca gattttattt 660
attctcccct aacccagggc aatgtcagct attggcagta aagtggcgct acaaacacta 720 



<210> 71 

<211> 795 

<212> DNA 

<213> Homo sapiens 




ctgctggggo gccccattga caaatatgtt tttgagaaga tggaggagaa ggaggcaggc 
tgctcttctg aaacaggact tctocoaggc tctatctttg ccocatcagg tgccaattcc 
cttcttgaca tggccagcaa gatecgggag gacccactct tcatcatcag gaagaaggag 
gaggagaaaa aacgagaggt attaaataat ccagtgaaaa tgaagaaaat oaaagaattg 
ttgcaaatga gtctggaaaa aaaggagaag aagaaaaaga aggagaagaa aaagaagcac 
aagaaacata agcacagaag otcgagtagt gatcgttcca gcagcgagga tgagcacagt 
gcagggagat cacagaagaa gatggcaaat tcctcccctg ttttgtocaa agtccctgga 
tatggcttac aggtcoggaa ctetgaccgt aaccagggtc ttcagggtcc tctgacagca 



<400> 71 

tacggctgcg agaagacgac agaagctaga cccaatctcc tttgggatgt gggcagggag 60

ggaagcaggc ttggagggtt aatttaccca cagaatgtga tagtaatagg ggagggaggc 120

tgctgcgggt ttaactcctg ggttggctgt tgggtagaca ggtggggaaa aggcccgtga 180 

gtcattgtaa gcacaggtcc aacttggccc tgactcctgc gggggtatgg ggaagctgtg 240 

acagaaacga tgggtgctgt ggtcctctgc aggccctcac cccttaaatt cctcatacag 300 

actggcactg ggcagggcct ctcatgtggc agccacatgt ggcgttgtga ggccacccca 360 

tgtggggtct gtggtgagag tcctgtagga tccctgctca agcagcacag aggaaggggc 420 

aagacgtggc ctgtaggcac tgtctcagcc tgcagagaag aaagtgaggc cgggagcctg 480 

agcctgggct ggagccttct cccctcccca gttggactag gggcagtgtt aattttgaaa 540 

aggtgtgggt ccctgtgtcc tcttccaggg gtccaaggga acaggagagg tcactgggcc 600 

tgttttctcc ctcctgaccc tgcatctccc aacccgtgta tcatagggaa ctttcacctt 660 

aaaatctttc taagcaaagt gtgaatagga tttttactcc ctttgtacag tattctgaga 720 

aacgcaaata aaagggcaac atgtttatgt taaaaaaaaa aaaaagtacg caaaaaaaaa 780 

aaaaa 795 

<210> 72 

<211> 2356 

<212> DNA 

<213> Homo sapiens 



<400> 72 

ggcacgaggc cggaagtgac ctctagagcg gtggtgaaao tggcagttga cggotootgg 60 
gactagatcc cgcgaggtag oococgaact atttctctac gttttctctt gatcctcccg 120 
aaatcttcca gatcogcgta gtgaggaatc gtctooaocg tcatgggggg cggagacctg 180 
aatctgaaga agagctggca cccgcagacc ctcaggaatg tggagaaagt gtggaaggoc 240 

300 
360 



gagcagaagc atgaggatga gcggaagaag attgaggagc ttcagcggga gctgcgagaa 
gagagagccc gggaagagat gcagcgctat gcggaggatg ttggggccgt caagaaaaaa 
gaagaaaagt tggactggat gtaccagggt cetggtggga tggtgaaocg tgacgagtac 420 

480 

540 
600 
660 
720 
780 
840 
900 



gagcaaaaga gagggcatgg gatgaagaac cattccagat ccagaagctc ctcocactca 960 
cccccaagac atgccagcaa gaagagcacc agggaagcag ggtcacggga oaggaggtct 
cgatccctgg goagaaggtc acggtcccca agacccagca aaotgcacaa ctotaaggtg 
aacaggagag agacaggcca aactaggagc ccatcaccta aaaaagaggt ctacoaaagg 
cgacatgctc ccggatacac cagaaaactc tctgcagagg aattagagcg aaaacggcaa 1200 
gagatgatgg aaaacgccaa atggagggag gaggagagac tgaacatcct caagaggcat 
gctaaggatg aggaacggga gcagaggcta gagaagctgg actcocggga tgggaagttc 



1020 
1080 
1140 



1260 
1320 
atccaccgca tgaagctgga gagtgcatct 
aatatctact otttacagag aacttcggta 
aaactgtacc ctctettatt ggttttcctg 
ttctctttat aagagttcaa atgacttctt 
tgaccctgct tcattgagtc ctgaaacagc 
tttgtgggaa actcagtaac tttgggtttt 
tcggtgggag tgcttgtgec actctggaag 
gccacttcct tcttacctgt gccaacagac 
cttcagaaac ctctctggtg tcacccagat 
tacttgattt agaagataat gtgacagaat 
gacagaatct ggaaaatcaa acaatacaaa 
ctttggatgt aaaattgtaa tgcgtatatg 
aaaaggttag ctttgtgaaa ataccttgtt 
acattgaace ttgatggcaa gtaatacaat 
tctggctggt ttaggaggag cctgggttta 
cactgcttgc agtctccaat gtaggcagtg 
agaaccaaat tgccaatatg ctccatggct 

<210> 73 

<211> 1646 

<212> DNA 

<213> Homo sapiens 



acttcctccc tggaggatcg ggtgaagcgg 
gctctggaga agaactttat gaaaagatga 
cattttccag ggaagotgot gaccccttaa 
tcacagatgt caaaccacca gtgttcaaag 
tcacttcctt tgagagctag tgtgacttgc 
gactotttaa cgggtgggca ctggaccatc 
gctgttccct ggggttgtga tgtttatcat 
ctatttcact gcctcagcgt acaocagaco 
agattgtgct tactgagaca aatgaacgtt 
gatgtcaggt taggtcaaag ccaagggagt 
aagccctaaa tgaactgtta actatttgat 
tacaaatgta caatttttao atgcttttaa 
tggtcaatga ctttactggg taatagaacc 
aaggcaggcc agctcgtttt tctctctgaa 
tcgacgagat ctggagtatc tattctttto 
taaaggtata gtaaaatgat tttaggagtc 
cctaaaggaa aataaaatgg aagtttttaa 



<400> 73 

gtggaatgtc atcagttaag gctattttca 
ccatctggat gtatacatgc aggtcacagg 
cctgacacct caggctgcca aatgtggaag 
caaaaactga aattggcttc tgtttctgag 
aggaaatcac aagaattgta gttaaggaga 
gaagccttgt tgatgctgat agattccgct 
tctttggatg ccggcactac acaacaggcc 
gggacaagtg gttagatgaa ctggattctg 
ttotggataa tgtagactca acgggagagt 
tttcaggcag tttccagggc ttccaocato 
cccagcagta tctggctacc cttgaaaaca 
tccgatcaat taatacgaga gaaaacctgt 
aggaggaaac cctgaaaagc gaccggcaat 
atctcagcta taaacacaag ggccaaaggg 
gctatcgagt aaagcagctt gtcttcocca 
cggaggattc cagaaacatg aaggagaagt 
tgacagagga gaagagaaaa gatgtgotaa 
atattcggca ggatctagag caaagagtat 
tggaggaccc agacaagcct ctcctaagca 
aagcgcgtgc aaaagccatt ctggacttcc 
agcagtttgt ggctgaggcc ctggagaagg 
aatctgtcat ggagcagaac tgggatgagc 
aocctgaggc acgaattcto tgtgcgctgt 
ctgaggggcc tacctctgtc tcttcctaac 
gggttttccc tttaocagtc tgtcotoact 
ggacctcttt aaaacaagca gccaaccatt 
catocaaaac cagcctggat ttcatacatg 
tgttaaaaaa aaaaaaaaaa aaaaaa 



tttcttttgt ggatottoag ttgcttcagg 
gaatatgatg gcttagcttg ggttcagagg 
atttaaatac ttgaaccaat accctcctcc 
ttggtccagg cgcaatgttc agcgtatttg 
tggatgctgg aggggatatg attgccgtta 
gcttccatct ggtgggggag aagagaactt 
tcaccctgat ggacattctg gaoacacatg 
ggctccaagg tcaaaaggct gagtttcaaa 
tgatagtgag attacccaaa gaaataacaa 
agaaaatcaa gatatcggag aaccggatat 
ggaagctgaa gagggaacta cccttttcat 
atctggtgac agaaactctg gagacggtaa 
ataaattttg gagccagato tctcagggcc 
aagtgacoat ccccccaaat cgggtcctga 
acaaggagac gatgagaaag tctttgggtt 
tggaggacat ggagagtgtc ctoaaggacc 
actccctcgc taagtgcctc ggoaaggagg 
otgaggtcct gatttccggg gagctacaca 
gcctttttaa tgctgctggg gtcttggtag 
tggatgccct gctagagctg tctgaagagc 
ggacccttcc tctgttgaag gaccaggtga 
tggooagcag tootcctgac atggactatg 
atgttgttgt ctctatcctg otggagctgg 
tacaaaagcc ctttctcccc acaagcctct 
gccatcgcoa ctaccatcct gtcaccagtg 
etttgatgta tcccattcgc tccatgttaa 
gacttctgat taaaagtggc aggttgtgca 
<210> 74 
<211> 3340 
<212> DNA 



<213> Homo sapiens 



<400> 74 

cgggogccca gagacagcgc cgcctcagat atcctgctgg atgacattgt ccttaoccat 60 
tctctcttcc tcccgacgga gaaatttctg caggagctac accagtactt tgttogggoa 120 

180 
240 
300 
360 
420 
480 
540 
600 



ggaggcatgg agggeeetga agggctgggc cggaagcaag cctgtctago catgcttctc 
catttcttgg acacctacca ggggctgctt eaagaggaag agggggccgg ccacatcatc 
aaggatctat aoctgctaat tatgaaggac gagtcccttt accagggcct ccgagaggac 
actctgaggc tgcacoagct ggtggagacg gtggaaotaa agattccaga ggagaaccag 
ccacccagca agcaggtgaa gceactcttc cgccaottcc gccggataga ctcctgtctg 
cagacccggg tggccttccg gggctctgat gagatcttet gccgtgtata catgcctgac 
cactcttatg tgaocataog cagccgcctt tcagcatctg tgcaggacat tetgggctct 
gtgacggaga aacttcaata ttcagaggag cccgcggggc gtgaggattc cctcatcctg 
gtagctgtgt cctcctctgg agagaaggtc cttctcoagc ccactgagga ctgtgttttc 660 
accgcactgg gcatcaacag ccacctgttt gcctgtactc gggacagcta tgaggctctg 
gtgceectcc cogaggagat ccaggtctcc cctggagaca cagagatcca ccgagtggag 
cctgaggacg ttgccaaeca octaactgcc ttccaotggg agctgttccg atgtgtgcat 
gagctggagt tcgtggacta cgtgttccac ggggagogcg gocgccggga gacggccaac 
ttggagctgo tgctgcagcg ctgcagcgag gtcacgcact gggtggcoao cgaagtgctg 
ctctgcgagg cocogggcaa gcgcgcgcag ctgctcaaga agttcatcaa gatcgcggcc 1020 
ctctgoaagc agaaccagga cctgotgtct ttctacgccg tggtcatggg gctggacaac 
gccgctgtca gccgccttcg actcacctgg gagaagctgc cagggaaatt caagaacttg 
tttcgcaaat ttgagaacct gacggacccc tgcaggaaoc acaaaagcta ccgagaagtg 
atctccaaaa tgaagccccc tgtgattccc ttcgtgcctc tgatcctcaa agacctgact 1260 
1320 

1380 
1440 
1500 
1560 
1620 



720 
780 
840 
900 
960 



1080 
1140 
1200 



ttcctgcacg aagggagtaa gacccttgta gatggtttgg tgaacatcga gaagctgcat 
tcagtggccg aaaaagtgag gacaatccgc aaataccgga gccggcccct ttgcctggac 
atggaggcat cccccaatca cctgcagacc aaggcctatg tgcgccagtt tcaggtcatc 
gacaaccaga acctcetctt cgagctctcc tacaagctgg aggcaaacag tcagtgagag 
tggaggctcc agtcagaccc gccagatcct tgggcacotg gcaotcaagc actttgcacg 
atgtctcaac caacatctga catctttccc gtggagcaao ttootgctoo acgggaaaga 
ggtcgatgga tttaoccctg gacccataag tctgttcatc ctgotgaagt cccctcccca 1680 
ttgctccttc aagccaaaac tacactttgc tggttcctgt cccctctgag aaaggggata 1740 
gaaagctcct tcctctatgt cctcccatcg agatctgttc tggggatgga gcttccaact 1800 
tcctcttgca goaggaaaga atgctgctca cccttotgtc ttgcagagtg ggattgtggg 1860 
agggattggc agccttcttc tccaccacct gtccagcttc ttcctggtca gggctgggae 1920 
ccccaggaat attatgttgo cgtgtgtgtg tgtgtgtgtg tgtgtgtgtg tgtgtgtgtg 1980 
tgtgtcttct tttagggagc aggagtgcat ctggtaattg agggtggatg ttgtgtgtgc 
tggggagggg tcottctgtt tggtgctacc cttgtctact ctgcccctgg atggtgcggg 
gtgctttotc cacccccaoa ctoootgctc agctcctogt gctgccctgc atgcccaggc 
ttgtgagcca aggtgctttt tggggcaggg agtagoagca ggtgggaggg gttacccatc 
agcoottgca agtcccccac tcaggcotct ggaaggtcca gggatgggct otgatgagag 
ggtaaaagat gctcagggaa acacaggcct cagotgccta gaggaccctc cccctgcctt 
gcagtgggct cgggtagagc agtatcagga gctagggttg tctgctgccc acactcctgc 
tttttgggat atctaactgc taaggaggga gttgacatcc ocottotggc tcatgtgtot 
gacaccaaca aoatggtctc cgtcoctctc tcttagactc tccctttgtc ctccccatag 
agctggggtg gggtggatcc ctataotggg gcaggcagcc ccaaagtggg ggagggggat 
ggcagagact gtaaaggcgo cactggactc tggcaaggcc tttattacct ttactccctc 
cctctcccat caccagcctc aaggcctgag gggtgcaggg gctoctggoa gctactgggt 
gaggtttcct ggcacagact cacccttctt tctggcacca ctctttccct tttgaagaga 
cagcaacagc cgtagoaaaa gcagatgctg ctcctgctat gagggtgtat atatttttta 
cccaaagctc tggaattgta catttatttt ttaaaactca aagagggaaa gagccttgta 



2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
tcatatgtga acattgtatc ataggtaatg ttgtacagac ccttttatac agtgatctgt 2940 

cttgttcctg cagcaaaaat cotctatgga cataggaggt gctgtgtccc atgccttott 3000 

gccctgacag tgtccoatgg gcccccttct gctccotgcc occtccctgc taotgctgat 3060 

gcactgtcot ctccctgcag ccootggctt cccagccttc ctcctgaccc ottccaacag 3120 

ccttggaact ccagctgcca ccacoctotg ggtcggacac tgggacccac tggcccagtc 3180 

ttggctgctg cttacoccta gccttgatga otgcccaggg acccccagcc ccctcccgtt 3240 

gccctgcagc tttaacagag tgaaceatgt gtattgtaca ggogcggttg tcattgoaga 3300 

aaccgctggg tggagaagaa gccgataaag tctatgaatc 3340 

<210> 75 

<211> 4005 

<212> DNA 

<213> Homo sapiens 



<400> 75 

gggcaacagt ctgcceaact gtggacacca gatcctggga gctcctggtt agcaagtgag 60 

atctctggga tgtcagtgag gctggttgaa gaccagaggt aaactgcaga ggtcaccaoc 120 

occaocatgt cccaggtgat gtccagccea ctgctggcag gaggccatgo tgtoagattg 180 

gcgcettgtg atgagcccag gaggaccctg cacccagcac ccagccccag cotgccaccc 240 

cagtgttctt actacaccac ggaaggotgg ggagcccagg ccctgatggc coccgtgccc 300 

tgoatggggc cocctggccg actccagcaa gccccacagg tggaggccaa agcoacctgc 360 

ttcctgccgt cccotggtga gaaggccttg gggaecccag aggacettga otcctacatt 420 

gaettotcac tggagagcct aaatoagatg atcctggaac tggaccccac cttocagctg 480 

cttcccccag ggaotggggg otcccaggct gagctggccc agagcaccat gtcaatgaga 540 

aagaaggagg aatctgaagc cttggacata aagtacatcg aggtgacctc cgccagatca 600 

aggtgccaog attggcccca gcactgctcc agcccctctg tcaccccgcc cttcggctcc 660 

720 



cctcgcagtg gtggcctcct cctttccaga gacgtccccc gagagacacg aagcagcagt 
gagagcctca tcttctctgg gaaccagggc agggggcacc agcgccctct gcccccctca 780 

840 
900 
960 



gagggtctct cccctcgacc cccaaattcc cccagcatct caatcccttg catggggagc 

aaggcctcga gcccccatgg tttgggctcc ccgctggtgg cttctccaag actggagaag 

cggctgggag goctggcccc acagcggggc agcaggatot ctgtgctgtc agccagcoca 

gtgtctgatg tcagctatat gtttggaagc agccagtccc toctgoaotc cagcaactcc 1020 



1080 



agccatcagt catcttocag atccttggaa agtccagcca actottcctc cagcotccac 
agccttggct oagtgtcoct gtgtacaaga cccagtgact tccaggctcc cagaaacccc 1140 
accctaacca tgggcaaacc cagaacaccc cactctocac cactggccaa agaacatgcc 1200 
aacatctqcc occcatccat caccaactcc atggtggaca tacocattgt gctgatcaac 1260 
1320 

1380 
1440 



ggctgoccag aacoagggtc ttatccaccc cagcggaccc caggacacca gaaotcogtt 
caacctggag ctgcttctcc cagcaacccc tgtccagcca ccaggagcaa cagccagacc 
ctgtcagatg ccccctttac oaoatgccaa gagggtooog ccagggacat gcagcocacc 
atgaagttcg tgatggacac atctaaatac tggtttaagc caaacatoao ccgagagcaa 1500 
gcaatcgagc tgctgaggaa ggaggagcca ggggcttttg tcataaggga cagctottca 1560 
taccgaggct ccttoggcct ggccotgaag gtgcaggagg ttcocgcgtc tgctcagaat 1620 
cgaccaggtg aggacagcaa tgacctcatc ogacaottco tcatcgagta gtctgacaaa 1680 
ggagtgcatc tcaaaggagc agatgaggag ccctactttg ggagootctc tgccttcgtg 1740 

1800 
1860 
1920 



tgccagcatt ocatcatggc cctggcoctg ccctgcaaao tcaccatccc acagagagaa 
ctgggaggtg cagatggggc ctcggaotot acagacagcc cagcctcctg ocagaagaaa 
tctgcgggct gccacaccct gtacctgagc tcagtgagcg tggagaccct gactggagcc 

ctggcogtgc agaaagccat ctccaccacc tttgagaggg acatcctocc cacgcccaco 1980 

gtggtccact tcgaagtcac agagcagggc atcactctga ctgatgtcca gaggaaggtg 2040 

tttttccggc gccattaccc actcaccacc ctccgcttct gtggtatgga ccctgagcaa 2100 

cggaagtggc agaagtactg caaaccctcc tggatctttg ggtttgtggc caagagccag 2160 

acagagcctc aggagaacgt atgccacctc tttgcggagt atgacatggt ccagccagcc 2220 

tcgcaggtca tcggcctggt gactgctctg ctgoaggacg cagaaaggat gtaggggaga 2280 

gactgootgt gcacctaaco aacacctcoa ggggctcgot aaggagccco cctccacccc 2340 
ctgaatgggt gtggcttgtg gccatattga cagaccaatc tatgggacta gggggattgg 2400
catcaagttg acacccttga acctgctatg gccttcagca gtcaccatca tccagacccc 2460
ccgggcctca gtttcctcaa tcatagaaga agaccaatag acaagatcag ctgttcttag 2520
atgctggtgg gcatttgaaa atgctcctcc atgattctga agcatgcaca cctctgaaga 2580
cccctgcatg aaaataacct ccaaggaccc tctgacccca tcgacctggg ccctgcccac 2640
acaacagtct gagcaagaga cctgcagccc ctgtttcgtg gcagacagca ggtgcctggc 2700
ggtgacccac ggggctcctg gcttgcagct ggtgatggtc aagaactgac tacaaaacag 2760
gaatggatag actctatttc cttccatatc tgttcctctg ttccttttcc aactttctgg 2820
gtggcttttt gggtccaccc agccaggatg ctgcaggcca agctgggtgt ggtatttagg 2880
gcagctcagc agggggaact tgtccccatg gtcagaggag acccagctgt cctgcacccc 2940
cttgcagatg agtatcaccc catcttttct ttccacttgg tttttatttt tatttttttt 3000
gagacagagt ctcactgtca cccaggctga actgcagtgg tgtgatctag gctcaatgca 3060
acctccacct cccaggttca agcaattatc ctgcctcagg ctcccgagta gctgggatta 3120
caggcatgtg caactcaccc agctaatttt gtatttttag tagagacagg gtttcaccat 3180
gttggccagg ctggtcttga actcctgacc gcaggtaatc cacctgcttc ggcctcccaa 3240
agtgctggga ttacaggcgc aagccaccca gcccagcttc tttccattcc ttgataggcg 3300
agtattccaa agctggtatc gtagctgccc taatgttgca tattaggcgg cgggggcaga 3360
gataagggcc atctctctgt gattctgcct cagatcctgt cttgctgagc cctcccccaa 3420
cccacgctcc aacacacaca cacacacaca cacacacaca cacacacaca cacacacaca 3480
cacgcccctc tactgctatg tggcttcaac cagcctcaca gccacacggg ggaagcagag 3540
agtcaagaat gcaaagaggc cgcttcccta agaggcttgg aggagctggg ctctatccca 3600
cacccacccc caccccaccc ccacacagcc tccagaagct ggaaccattt ctcccgcagg 3660
cctgagttcc taaggaaacc accctaccgg ggtggaaggg agggtcaggg aagaaaccca 3720
ctcttgctct acgaggagca agtgcctgcc ccctcccagc agccagccct gccaaagttg 3780
cattatcttt ggccaaggct gggcctgacg gttatgattt cagccctggg cctgcaggag 3840
aggctgagat cagcccaccc agccagtggt cgagcactgc cccgccgcca aagtctgcag 3900
aatgtgagat gaggttctca aggtcacagg ccccagtccc agcctggggg ctggcagagg 3960
cccccatata ctctgataca gctcctatca tgaaaaataa aatgt 4005

<210> 76 

<211> 1093 

<212> PRT 

<213> Homo sapiens 



<400> 76 
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Cys Ser 
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15 
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Gly His 
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20 










25 






30 


Ala 


Val 


His 


Gin 


Ala 


Cys 


Tyr 


Gly 


He 


Val 


Gin Val 


Pro Thr Gly Pro 






35 










40 








45 


Trp 


Phe 


Cys 


Arg 


Lys 


Cys 


Glu 


Ser 


Gin 


Glu 


Arg Ala 


Ala Arg Val Arg 




50 










55 








60 




Cys 


Glu 




Cys 


Pro 


His 


Lys 


Asp 


Gly 


Ala 


Leu Lys 


Arg Thr Asp Asn 


65 










70 










75 


80 


Gly 


Gly 




Ala 


His 


Val 
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Cys 


Ala 


Leu 


Tyr He 


Pro Glu Val Gin 










85 










90 




95 


Phe 


Ala 


Asn 


Val 


Leu 


Thr 


Met 


Glu 


Pro 


He 


Val Leu 


Gin Tyr Val Pro 








100 










105 






110 


His 


Asp 


Arg 


Phe 






Thr 


Cys 




He 


Cys Glu 


Glu Thr Gly Arg 




115 










120 








125 


Glu 


Ser 




Ala 


Ala 


Ser 


Gly 


Ala 


Cys 


Met 




Asn Arg His Gly 




130 










135 








140 
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Cys 


Arg 


Gin 


Ala 


Phe 


His 


Val 




Cys 






Gly 
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Glu 


Glu 


Val 
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Cys 
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er y 
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Gly 


Gly 
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210 
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Ser Arg 
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Ser 
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Gin 
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p 15 
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230 








235 
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Glu 


Arg 


Leu 


Lys 


Gin Lys 


His Lys 


Lys 
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p 5 ° V 1 
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Ala Asp 
P 


Lys 


Val 


Ser 
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Ser 
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Lys 
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Ser Ser 


y 




305 
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Val 


Ser 


Ser 


Phe 
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Ser 


Ala 


Ser 


Ser 


Ser Ser 
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Ser Ser 


Ser 


Ser Ser 


Ser 
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Val Ser 


Ser Leu 




Ser er 
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Pr ° If " 
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Gl 
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430 




Thr 


Gly 


Ser 


Gly Ala 
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Pro 
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Thr Thr Gin Val Phe Ser Leu Ala Gly Ser Thr Phe Ser Leu Pro Ser 

610 615 620 

Thr His Ile Phe Gly Thr Pro Met Gly Ala Val Asn Pro Leu Leu Ser
625 630 635 640

Gln Ala Glu Ser Ser His Thr Glu Pro Asp Leu Glu Asp Cys Ser Phe

645 650 655

Arg Cys Arg Gly Thr Ser Pro Gln Glu Ser Leu Ser Ser Met Ser Pro

660 665 670

Ile Ser Ser Leu Pro Ala Leu Phe Asp Gln Thr Ala Ser Ala Pro Cys

675 680 685

Gly Gly Gly Gln Leu Asp Pro Ala Ala Pro Gly Thr Thr Asn Met Glu

690 695 700

Gln Leu Leu Glu Lys Gln Gly Asp Gly Glu Ala Gly Val Asn Ile Val
705 710 715 720

Glu Met Leu Lys Ala Leu His Ala Leu Gln Lys Glu Asn Gln Arg Leu

725 730 735

Gln Glu Gln Ile Leu Ser Leu Thr Ala Lys Lys Glu Arg Leu Gln Ile

740 745 750

Leu Asn Val Gln Leu Ser Val Pro Phe Pro Ala Leu Pro Ala Ala Leu

755 760 765

Pro Ala Ala Asn Gly Pro Val Pro Gly Pro Tyr Gly Leu Pro Pro Gln

770 775 780 

Ala Gly Ser Ser Asp Ser Leu Ser Thr Ser Lys Ser Pro Pro Gly Lys 
785 790 795 800 

Ser Ser Leu Gly Leu Asp Asn Ser Leu Ser Thr Ser Ser Glu Asp Pro 

805 810 815 

His Ser Gly Cys Pro Ser Arg Ser Ser Ser Ser Leu Ser Phe His Ser 

820 825 830 

Thr Pro Pro Pro Leu Pro Leu Leu Gln Gln Ser Pro Ala Thr Leu Pro

835 840 845

Leu Ala Leu Pro Gly Ala Pro Ala Pro Leu Pro Pro Gln Pro Gln Asn

850 855 860 

Gly Leu Gly Arg Ala Pro Gly Ala Ala Gly Leu Gly Ala Met Pro Met 
865 870 875 880 

Ala Glu Gly Leu Leu Gly Gly Leu Ala Gly Ser Gly Gly Leu Pro Leu 

885 890 895 

Asn Gly Leu Leu Gly Gly Leu Asn Gly Ala Ala Ala Pro Asn Pro Ala 

900 905 910 

Ser Leu Ser Gln Ala Gly Gly Ala Pro Thr Leu Gln Leu Pro Gly Cys

915 920 925

Leu Asn Ser Leu Thr Glu Gln Gln Arg His Leu Leu Gln Gln Gln Glu

930 935 940

Gln Gln Leu Gln Gln Leu Gln Gln Leu Leu Ala Ser Pro Gln Leu Thr
945 950 955 960

Pro Glu His Gln Thr Val Val Tyr Gln Met Ile Gln Gln Ile Gln Gln

965 970 975

Lys Arg Glu Leu Gln Arg Leu Gln Met Ala Gly Gly Ser Gln Leu Pro

980 985 990 

Met Ala Ser Leu Leu Ala Gly Ser Ser Thr Pro Leu Leu Ser Ala Gly 

995 1000 1005 

Thr Pro Gly Leu Leu Pro Thr Ala Ser Ala Pro Pro Leu Leu Pro 

1010 1015 1020 

Ala Gly Ala Leu Val Ala Pro Ser Leu Gly Asn Asn Thr Ser Leu 

1025 1030 1035 

Met Ala Ala Ala Ala Ala Ala Ala Ala Val Ala Ala Ala Gly Gly 

1040 1045 1050 

Pro Pro Val Leu Thr Ala Gln Thr Asn Pro Phe Leu Ser Leu Ser
1055 1060 1065 
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Gly Ala Glu Gly Ser Gly Gly Gly Pro Lys Gly Gly Thr Ala Asp 
1070 1075 1080 

Lys Gly Ala Ser Ala Asn Gln Glu Lys Gly
1085 1090 
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Leu Leu Arg His lie Lys Leu His Thr Gly Glu Lys Pro Phe Lys Cys 

165 170 175 

His Leu Cys Asn Tyr Ala Cys Gln Arg Arg Asp Ala Leu Thr Gly His

180 185 190

Leu Arg Thr His Ser Val Glu Lys Pro Tyr Lys Cys Glu Phe Cys Gly

195 200 205

Arg Ser Tyr Lys Gln Arg Ser Ser Leu Glu Glu His Lys Glu Arg Cys

210 215 220

Arg Thr Phe Leu Gln Ser Thr Asp Pro Gly Asp Thr Ala Ser Ala Glu
225 230 235 240

Ala Arg His Ile Lys Ala Glu Met Gly Ser Glu Arg Ala Leu Val Leu

245 250 255

Asp Arg Leu Ala Ser Asn Val Ala Lys Arg Lys Ser Ser Met Pro Gln

260 265 270

Lys Phe Ile Gly Glu Lys Arg His Cys Phe Asp Val Asn Tyr Asn Ser

275 280 285

Ser Tyr Met Tyr Glu Lys Glu Ser Glu Leu Ile Gln Thr Arg Met Met

290 295 300

Asp Gln Ala Ile Asn Asn Ala Ile Ser Tyr Leu Gly Ala Glu Ala Leu
305 310 315 320

Cys Pro Leu Val Gln Thr Pro Pro Ala Pro Thr Ser Glu Met Val Pro

325 330 335

Val Ile Ser Ser Met Tyr Pro Ile Ala Leu Thr Arg Ala Glu Met Ser

340 345 350

Asn Gly Ala Pro Gln Glu Leu Glu Arg Lys Ser Ile Leu Leu Pro Glu

355 360 365

Lys Ser Val Pro Ser Glu Arg Gly Leu Ser Pro Asn Asn Ser Gly His

370 375 380

Asp Ser Thr Asp Thr Asp Ser Asn His Glu Glu Arg Gln Asn His Ile
385 390 395 400

Tyr Gln Gln Asn His Met Val Leu Ser Arg Ala Arg Asn Gly Met Pro

405 410 415 

Leu Leu Lys Glu Val Pro Arg Ser Tyr Glu Leu Leu Lys Pro Pro Pro 
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He Cys Pro Arg Asp Ser Val Lys Val He Asp Lys Glu Gly Glu Val 
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Met Asp Val Tyr Arg Cys Asp His Cys Arg Val Leu Phe Leu Asp Tyr 
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Glu Cys Asn Met Cys Gly Asp Arg Ser His Asp Arg Tyr Glu Phe Ser 

485 490 495 

Ser His Ile Ala Arg Gly Glu His Arg Ser Leu Leu Lys
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Val 


Val Ser Gly Thr 


Cys Ala Glu 


130 




135 




140 




Gln


Met 


Tyr Gly Met 


Cys Glu Ser 




Trp Glu Pro Asn 


Met Asp Pro 


145 




150 




155 


160 


Asp 


His 


Leu Phe Glu 


Thr Ile Ser


Gln


Ala Met Leu Asn 


Ala Val Asp 




165 






170 


175 


Arg 


Asp 


Ala Val Ser 


Gly Met Gly 


Val 


Ile Val His Ile


Ile Glu Lys


180 




185 




190 


Asp 


Lys 


Ile Thr Thr


Arg Thr Leu 


Lys 


Ala Arg Met Asp 




195 


200 




205 





<210> 83 

<211> 190 

<212> PRT 

<213> Homo sapiens 

<400> 83 

Leu Thr Arg Ser Cys Ser Thr Cys Cys Pro Ala Val Ala Cys Leu Val
1 5 10 15

Gly Arg Gly Val Val Thr Ser Gly Ala Met His Gln Cys Trp Gly Glu
20 25 30

Glu Met Leu Gln Gly Met Leu Leu Trp Gly Trp Ala Thr Cys Pro Leu
35 40 45

Ser Asn Pro Gly Arg Trp Gly Arg Thr Val Gly Leu Gln His Pro Ala
50 55 60

Val Val Ser Ala Phe Arg Ala Leu Leu Leu Leu Met Leu Thr Val His
65 70 75 80

Val Ser Tyr Leu Ser Leu Ile Arg Phe Asp Tyr Gly Tyr Asn Leu Val
85 90 95

Ala Asn Val Ala Ile Gly Leu Val Asn Val Val Trp Trp Leu Ala Trp
100 105 110

Cys Leu Trp Asn Gln Arg Arg Leu Pro His Val Arg Lys Cys Val Val
115 120 125

Val Val Leu Leu Leu Gln Gly Leu Ser Leu Leu Glu Leu Leu Asp Phe
130 135 140

Pro Pro Leu Phe Trp Val Leu Asp Ala His Ala Ile Trp His Ile Ser
145 150 155 160

Thr Ile Pro Val His Val Leu Phe Phe Ser Phe Leu Glu Asp Asp Ser
165 170 175

Leu Tyr Leu Leu Lys Glu Ser Glu Asp Lys Phe Lys Leu Asp
180 185 190

<210> 84

<211> 368

<212> PRT

<213> Homo sapiens





<400> 84

Ala Pro Pro Pro Ala Ala Ser Gln Gly Glu Arg Met Ala Gly Leu Ala
1 5 10 15

Ala Arg Leu Val Leu Leu Ala Gly Ala Ala Ala Leu Ala Ser Gly Ser
20 25 30

Gln Gly Asp Arg Glu Pro Val Tyr Arg Asp Cys Val Leu Gln Cys Glu
35 40 45

Glu Gln Asn Cys Ser Gly Gly Ala Leu Asn His Phe Arg Ser Arg Gln
50 55 60

Pro Ile Tyr Met Ser Leu Ala Gly Trp Thr Cys Arg Asp Asp Cys Lys
65 70 75 80

Tyr Glu Cys Met Trp Val Thr Val Gly Leu Tyr Leu Gln Glu Gly His
85 90 95

Lys Val Pro Gln Phe His Gly Lys Trp Pro Phe Ser Arg Phe Leu Phe
100 105 110

Phe Gln Glu Pro Ala Ser Ala Val Ala Ser Phe Leu Asn Gly Leu Ala
115 120 125

Ser Leu Val Met Leu Cys Arg Tyr Arg Thr Phe Val Pro Ala Ser Ser
130 135 140

Pro Met Tyr His Thr Cys Val Ala Phe Ala Trp Val Ser Leu Asn Ala
145 150 155 160

Trp Phe Trp Ser Thr Val Phe His Thr Arg Asp Thr Asp Leu Thr Glu
165 170 175

Lys Met Asp Tyr Phe Cys Ala Ser Thr Val Ile Leu His Ser Ile Tyr
180 185 190

Leu Cys Cys Val Arg Thr Val Gly Leu Gln His Pro Ala Val Val Ser
195 200 205

Ala Phe Arg Ala Leu Leu Leu Leu Met Leu Thr Val His Val Ser Tyr
210 215 220

Leu Ser Leu Ile Arg Phe Asp Tyr Gly Tyr Asn Leu Val Ala Asn Val
225 230 235 240

Ala Ile Gly Leu Val Asn Val Val Trp Trp Leu Ala Trp Cys Leu Trp
245 250 255





Asn Gln Arg Arg Leu Pro His Val Arg Lys Cys Val Val Val Val Leu

260 265 270

Leu Leu Gln Gly Leu Ser Leu Leu Glu Leu Leu Asp Phe Pro Pro Leu

275 280 285

Phe Trp Val Leu Asp Ala His Ala Ile Trp His Ile Ser Thr Ile Pro

290 295 300

Val His Val Leu Phe Phe Ser Phe Leu Glu Asp Asp Ser Leu Tyr Leu
305 310 315 320

Leu Lys Glu Ser Glu Asp Lys Phe Lys Leu Val Glu Ala Asp Trp Ile

325 330 335 

Phe Ala Leu Pro Leu Thr Pro Cys Pro Ser Leu Arg Glu Gly Ser Tyr 

340 345 350 

Ala Arg Thr Pro Thr Ser Gly Thr Arg Val Ala Cys Ala Ser Phe Phe 
355 360 365 



<210> 85 

<211> 190 

<212> PRT 

<213> Homo sapiens 



<400> 85 

Leu Thr Arg Ser Cys Ser Thr Cys Cys Pro Ala Val Ala Cys Leu Val

1 5 10 15

Gly Arg Gly Val Val Thr Ser Gly Ala Met His Gln Cys Trp Gly Glu

20 25 30

Glu Met Leu Gln Gly Met Leu Leu Trp Gly Trp Ala Thr Cys Pro Leu

35 40 45

Ser Asn Pro Gly Arg Trp Gly Arg Thr Val Gly Leu Gln His Pro Ala

50 55 60

Val Val Ser Ala Phe Arg Ala Leu Leu Leu Leu Met Leu Thr Val His
65 70 75 80

Val Ser Tyr Leu Ser Leu Ile Arg Phe Asp Tyr Gly Tyr Asn Leu Val

85 90 95

Ala Asn Val Ala Ile Gly Leu Val Asn Val Val Trp Trp Leu Ala Trp

100 105 110

Cys Leu Trp Asn Gln Arg Arg Leu Pro His Val Arg Lys Cys Val Val

115 120 125

Val Val Leu Leu Leu Gln Gly Leu Ser Leu Leu Glu Leu Leu Asp Phe

130 135 140

Pro Pro Leu Phe Trp Val Leu Asp Ala His Ala Ile Trp His Ile Ser
145 150 155 160

Thr Ile Pro Val His Val Leu Phe Phe Ser Phe Leu Glu Asp Asp Ser

165 170 175

Leu Tyr Leu Leu Lys Glu Ser Glu Asp Lys Phe Lys Leu Asp
180 185 190

<210> 86 
<211> 318 
<212> PRT 



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<213> Homo sapiens 



<400> 86 

Met Ala Gly Leu Ala Ala Arg Leu Val Leu Leu Ala Gly Ala Ala Ala 

1 5 10 15

Leu Ala Ser Gly Ser Gln Gly Asp Arg Glu Pro Val Tyr Arg Asp Cys

20 25 30

Val Leu Gln Cys Glu Glu Gln Asn Cys Ser Gly Gly Ala Leu Asn His

35 40 45

Phe Arg Ser Arg Gln Pro Ile Tyr Met Ser Leu Ala Gly Trp Thr Cys

50 55 60

Arg Asp Asp Cys Lys Tyr Glu Cys Met Trp Val Thr Val Gly Leu Tyr
65 70 75 80

Leu Gln Glu Gly His Lys Val Pro Gln Phe His Gly Lys Trp Pro Phe

85 90 95

Ser Arg Phe Leu Phe Phe Gln Glu Pro Ala Ser Ala Val Ala Ser Phe

100 105 110

Leu Asn Gly Leu Ala Ser Leu Val Met Leu Cys Arg Tyr Arg Thr Phe 

115 120 125 

Val Pro Ala Ser Ser Pro Met Tyr His Thr Cys Val Ala Phe Ala Trp 

130 135 140 

Val Ser Leu Asn Ala Trp Phe Trp Ser Thr Val Phe His Thr Arg Asp 
145 150 155 160 

Thr Asp Leu Gln Arg Lys Trp Thr Thr Ser Val Pro Pro Val Ser Tyr

165 170 175

Thr Gln Ser Thr Cys Ala Ala Ser Gly Pro Trp Gly Cys Ser Thr Gln

180 185 190 

Leu Trp Ser Ser Ala Phe Arg Ala Leu Leu Leu Leu Met Leu Thr Val 

195 200 205 

His Val Ser Tyr Leu Ser Leu Ile Arg Phe Asp Tyr Gly Tyr Asn Leu

210 215 220

Val Ala Asn Val Ala Ile Gly Leu Val Asn Val Val Trp Trp Leu Ala
225 230 235 240

Trp Cys Leu Trp Asn Gln Arg Arg Leu Pro His Val Arg Lys Cys Val

245 250 255

Val Val Val Leu Leu Leu Gln Gly Leu Ser Leu Leu Glu Leu Leu Asp

260 265 270

Phe Pro Pro Leu Phe Trp Val Leu Asp Ala His Ala Ile Trp His Ile

275 280 285

Ser Thr Ile Pro Val His Val Leu Phe Phe Ser Phe Leu Glu Asp Asp
290 295 300 

Ser Leu Tyr Leu Leu Lys Glu Ser Glu Asp Lys Phe Lys Leu 
305 310 315 

<210> 87 

<211> 226 

<212> PRT 

<213> Homo sapiens 



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<400> 87 

Met Ala Gly Leu Ala Ala Arg Leu Val Leu Leu Ala Gly Ala Ala Ala 

1 5 10 15 

Leu Ala Ser Gly Ser Gln Gly Asp Arg Glu Pro Val Tyr Arg Asp Cys

20 25 30

Val Leu Gln Cys Glu Glu Gln Asn Cys Ser Gly Gly Ala Leu Asn His

35 40 45

Phe Arg Ser Arg Gln Pro Ile Tyr Met Ser Leu Ala Gly Trp Thr Cys

50 55 60

Arg Asp Asp Cys Lys Tyr Glu Cys Met Trp Val Thr Val Gly Leu Tyr
65 70 75 80

Leu Gln Glu Gly His Lys Val Pro Gln Phe His Gly Lys Trp Pro Phe

85 90 95

Ser Arg Phe Leu Phe Phe Gln Glu Pro Ala Ser Ala Val Ala Ser Phe

100 105 110

Leu Asn Gly Leu Ala Ser Leu Val Met Leu Cys Arg Tyr Arg Thr Phe 

115 120 125 

Val Pro Ala Ser Ser Pro Met Tyr His Thr Cys Val Ala Phe Ala Trp 

130 135 140 

Val Ser Leu Asn Ala Trp Phe Trp Ser Thr Val Phe His Thr Arg Asp 
145 150 155 160 

Thr Asp Leu Thr Glu Lys Met Asp Tyr Phe Cys Ala Ser Thr Val Ile

165 170 175

Leu His Ser Ile Tyr Leu Cys Cys Val Arg Pro Gly Gln Arg Gly Val

180 185 190 

Val Ala Gly Leu Val Pro Val Glu Pro Ala Ala Ala Ala Ser Arg Ala 

195 200 205 

Gln Val Arg Gly Gly Gly Leu Ala Ala Ala Gly Ala Val Pro Ala Arg

210 215 220 

Ala Ala 
225 

<210> 88 
<211> 320 
<212> PRT 
<213> Homo sapiens 



<400> 88

Met Ala Gly Leu Ala Ala Arg Leu Val Leu Leu Ala Gly Ala Ala Ala
1 5 10 15

Leu Ala Ser Gly Ser Gln Gly Asp Arg Glu Pro Val Tyr Arg Asp Cys
20 25 30

Val Leu Gln Cys Glu Glu Gln Asn Cys Ser Gly Gly Ala Leu Asn His
35 40 45

Phe Arg Ser Arg Gln Pro Ile Tyr Met Ser Leu Ala Gly Trp Thr Cys
50 55 60

Arg Asp Asp Cys Lys Tyr Glu Cys Met Trp Val Thr Val Gly Leu Tyr
65 70 75 80

Leu Gln Glu Gly His Lys Val Pro Gln Phe His Gly Lys Trp Pro Phe
85 90 95

Ser Arg Phe Leu Phe Phe Gln Glu Pro Ala Ser Ala Val Ala Ser Phe
100 105 110

Leu Asn Gly Leu Ala Ser Leu Val Met Leu Cys Arg Tyr Arg Thr Phe
115 120 125

Val Pro Ala Ser Ser Pro Met Tyr His Thr Cys Val Ala Phe Ala Trp
130 135 140

Val Ser Leu Asn Ala Trp Phe Trp Ser Thr Val Phe His Thr Arg Asp
145 150 155 160

Thr Asp Leu Thr Glu Lys Met Asp Tyr Phe Cys Ala Ser Thr Val Ile
165 170 175

Leu His Ser Ile Tyr Leu Cys Cys Val Arg Thr Val Gly Leu Gln His
180 185 190

Pro Ala Val Val Ser Ala Phe Arg Ala Leu Leu Leu Leu Met Leu Thr
195 200 205

Val His Val Ser Tyr Leu Ser Leu Ile Arg Phe Asp Tyr Gly Tyr Asn
210 215 220

Leu Val Ala Asn Val Ala Ile Gly Leu Val Asn Val Val Trp Trp Leu
225 230 235 240

Ala Trp Cys Leu Trp Asn Gln Arg Arg Leu Pro His Val Arg Lys Cys
245 250 255

Val Val Val Val Leu Leu Leu Gln Gly Leu Ser Leu Leu Glu Leu Leu
260 265 270

Asp Phe Pro Pro Leu Phe Trp Val Leu Asp Ala His Ala Ile Trp His
275 280 285

Ile Ser Thr Ile Pro Val His Val Leu Phe Phe Ser Phe Leu Glu Asp
290 295 300

Asp Ser Leu Tyr Leu Leu Lys Glu Ser Glu Asp Lys Phe Lys Leu Asp
305 310 315 320



<210> 89 



<211> 217 
<212> PRT 
<213> Homo sapiens 



<400> 89

Ala Pro Pro Pro Ala Ala Ser Gln Gly Glu Arg Met Ala Gly Leu Ala
1 5 10 15

Ala Arg Leu Val Leu Leu Ala Gly Ala Ala Ala Leu Ala Ser Gly Ser
20 25 30

Gln Gly Asp Arg Glu Pro Val Tyr Arg Asp Cys Val Leu Gln Cys Glu
35 40 45

Glu Gln Asn Cys Ser Gly Gly Ala Leu Asn His Phe Arg Ser Arg Gln
50 55 60

Pro Ile Tyr Met Ser Leu Ala Gly Trp Thr Cys Arg Asp Asp Cys Lys
65 70 75 80

Tyr Glu Cys Met Trp Val Thr Val Gly Leu Tyr Leu Gln Glu Gly His
85 90 95

Lys Val Pro Gln Phe His Gly Lys Trp Pro Phe Ser Arg Phe Leu Phe
100 105 110

Phe Gln Glu Pro Ala Ser Ala Val Ala Ser Phe Leu Asn Gly Leu Ala
115 120 125

Ser Leu Val Met Leu Cys Arg Tyr Arg Thr Phe Val Pro Ala Ser Ser
130 135 140

Pro Met Tyr His Thr Cys Val Ala Phe Ala Trp Val Ser Leu Asn Ala
145 150 155 160

Trp Phe Trp Ser Thr Val Phe His Thr Arg Asp Thr Asp Leu Thr Glu
165 170 175

Lys Met Asp Tyr Phe Cys Ala Ser Thr Val Ile Leu His Ser Ile Tyr
180 185 190

Leu Cys Cys Val Ser Phe Leu Glu Asp Asp Ser Leu Tyr Leu Leu Lys
195 200 205

Glu Ser Glu Asp Lys Phe Lys Leu Asp
210 215

<210> 90 

<211> 153 

<212> PRT 

<213> Homo sapiens 



<400> 90 

Met Asn Val Gly Thr Ala His Ser Glu Val Asn Pro Asn Thr Arg Val

1 5 10 15

Met Asn Ser Arg Gly Ile Trp Leu Ser Tyr Val Leu Ala Ile Gly Leu

20 25 30

Leu His Ile Val Leu Leu Ser Ile Pro Phe Val Ser Val Pro Val Val

35 40 45

Trp Thr Leu Thr Asn Leu Ile His Asn Met Gly Met Tyr Ile Phe Leu

50 55 60

His Thr Val Lys Gly Thr Pro Phe Glu Thr Pro Asp Gln Gly Lys Ala
65 70 75 80

Arg Leu Leu Thr His Trp Glu Gln Met Asp Tyr Gly Val Gln Phe Thr

85 90 95

Ala Ser Arg Lys Phe Leu Thr Ile Thr Pro Ile Val Leu Tyr Phe Leu

100 105 110

Thr Ser Phe Tyr Thr Lys Tyr Asp Gln Ile His Phe Val Leu Asn Thr

115 120 125

Val Ser Leu Met Ser Val Leu Ile Pro Lys Leu Pro Gln Leu His Gly

130 135 140

Val Arg Ile Phe Gly Ile Asn Lys Tyr
145 150

<210> 91 

<211> 436 

<212> PRT 

<213> Homo sapiens 



<400> 91

Met Arg Arg Asp Val Asn Gly Val Thr Lys Ser Arg Phe Glu Met Phe
1 5 10 15

Ser Asn Ser Asp Glu Ala Val Ile Asn Lys Lys Leu Pro Lys Glu Leu
20 25 30

Leu Leu Arg Ile Phe Ser Phe Leu Asp Val Val Thr Leu Cys Arg Cys
35 40 45

Ala Gln Val Ser Arg Ala Trp Asn Val Leu Ala Leu Asp Gly Ser Asn
50 55 60

Trp Gln Arg Ile Asp Leu Phe Asp Phe Gln Arg Asp Ile Glu Gly Arg
65 70 75 80

Val Val Glu Asn Ile Ser Lys Arg Cys Gly Gly Phe Leu Arg Lys Leu
85 90 95

Ser Leu Arg Gly Cys Leu Gly Val Gly Asp Asn Ala Leu Arg Thr Phe
100 105 110

Ala Gln Asn Cys Arg Asn Ile Glu Val Leu Asn Leu Asn Gly Cys Thr
115 120 125

Lys Thr Thr Asp Ala Thr Cys Thr Ser Leu Ser Lys Phe Cys Ser Lys
130 135 140

Leu Arg His Leu Asp Leu Ala Ser Cys Thr Ser Ile Thr Asn Met Ser
145 150 155 160

Leu Lys Ala Leu Ser Glu Gly Cys Pro Leu Leu Glu Gln Leu Asn Ile
165 170 175

Ser Trp Cys Asp Gln Val Thr Lys Asp Gly Ile Gln Ala Leu Val Arg
180 185 190

Gly Cys Gly Gly Leu Lys Ala Leu Phe Leu Lys Gly Cys Thr Gln Leu
195 200 205

Glu Asp Glu Ala Leu Lys Tyr Ile Gly Ala His Cys Pro Glu Leu Val
210 215 220

Thr Leu Asn Leu Gln Thr Cys Leu Gln Ile Thr Asp Glu Gly Leu Ile
225 230 235 240

Thr Ile Cys Arg Gly Cys His Lys Leu Gln Ser Leu Cys Ala Ser Gly
245 250 255

Cys Ser Asn Ile Thr Asp Ala Ile Leu Asn Ala Leu Gly Gln Asn Cys
260 265 270

Pro Arg Leu Arg Ile Leu Glu Val Ala Arg Cys Ser Gln Leu Thr Asp
275 280 285

Val Gly Phe Thr Thr Leu Ala Arg Asn Cys His Glu Leu Glu Lys Met
290 295 300

Asp Leu Glu Glu Cys Val Gln Ile Thr Asp Ser Thr Leu Ile Gln Leu
305 310 315 320

Ser Ile His Cys Pro Arg Leu Gln Val Leu Ser Leu Ser His Cys Glu
325 330 335

Leu Ile Thr Asp Asp Gly Ile Arg His Leu Gly Asn Gly Ala Cys Ala
340 345 350

His Asp Gln Leu Glu Val Ile Glu Leu Asp Asn Cys Pro Leu Ile Thr
355 360 365

Asp Ala Ser Leu Glu His Leu Lys Ser Cys His Ser Leu Glu Arg Ile
370 375 380

Glu Leu Tyr Asp Cys Gln Gln Ile Thr Arg Ala Gly Ile Lys Arg Leu
385 390 395 400

Arg Thr His Leu Pro Asn Ile Lys Val His Ala Tyr Phe Ala Pro Val
405 410 415

Thr Pro Pro Pro Ser Val Gly Gly Ser Arg Gln Arg Phe Cys Arg Cys
420 425 430

Cys Ile Ile Leu
435

<210> 92 

<211> 204 

<212> PRT 

<213> Homo sapiens 



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<400> 92 








Ala 






Asp 


Pro Lys Asp 


Arg Lys Lys Ile


Gln Phe


Ser Val Pro 


Pro 


1 




5 




10 




15 






Ser 


Gln Leu Asp


Pro Arg Gln Val


Glu Met


Ile Arg Arg


Arg 


Arg 






20 


25 




30 




Glu 


Pro 


Thr 


Pro Ala Met 


Leu Phe Arg Leu 


Ser Glu 


His Ser Ser 


Pro 






35 


40 




45 






Glu 


Glu 


Ala Ser Pro 


His Gln Arg Ala


Ser Gly 


Glu Gly His 


His 


Leu 




50 




55 




60 






Lys 


Ser 


Lys Arg Pro 


Asn Pro Cys Ala 


Tyr Thr 


Pro Pro Ser 


Leu 


Lys 


65 






70 


75 






80 


Ala 


Val 


Gln Arg Ile


Ala Glu Ser His


Leu Gln


Ser Ile Ser


Asn 


Leu 






85 




90 




95 






Glu 


Asn Gln Ala


Ser Glu Glu Glu 


Asp Glu 


Leu Gly Glu 


Leu 


Arg 






100 


105 




110 






Glu 


Leu 


Gly Tyr Pro 


Arg Glu Glu Asp 


Glu Glu 


Glu Glu Glu 


Asp 


Asp 






115 


120 




125 




Val 


Glu 


Glu 


Glu Glu Glu 


Glu Glu Asp Ser 


Gln Ala


Glu Val Leu 


Lys 




130 




135 




140 






Ile


Arg


Gln Ser Ala


Gly Gln Lys Thr


Thr Cys


Gly Gln Gly


Leu 


145 




150 


155 






160 


Gly 


Pro 


Trp Glu Arg 


Pro Pro Pro Leu 


Asp Glu 


Ser Glu Arg 


Asp 


Gly 




165 




170 




175 




Gly 


Ser


Glu Asp Gln


Val Glu Asp Pro 


Ala Leu 


Ser Glu Pro 


Gly 


Glu 




180 


185 




190 






Glu 


Pro 


Gln Arg Pro


Ser Pro Ser Glu 


Pro Gly 












195 


200 











<210> 93 

<211> 115 

<212> PRT 
<400> 93

Met Ser Gly Glu Pro Gly Gln Thr Ser Val Ala Pro Pro Pro Glu Glu
1 5 10 15

Val Glu Pro Gly Ser Gly Val Arg Ile Val Val Glu Tyr Cys Glu Pro
20 25 30

Cys Gly Phe Glu Ala Thr Tyr Leu Glu Leu Ala Ser Ala Val Lys Glu
35 40 45

Gln Tyr Pro Gly Ile Glu Ile Glu Ser Arg Leu Gly Gly Thr Gly Ala
50 55 60

Phe Glu Ile Glu Ile Asn Gly Gln Leu Val Phe Ser Lys Leu Glu Asn
65 70 75 80

Gly Gly Phe Pro Tyr Glu Lys Asp Leu Ile Glu Ala Ile Arg Arg Ala
85 90 95

Ser Asn Gly Glu Thr Leu Glu Lys Ile Thr Asn Ser Arg Pro Pro Cys
100 105 110

Val Ile Leu
115

<210> 94
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<211> 144 
<212> PRT 
<213> Homo sapiens 



<400> 94 






Phe Leu Ile


Met 


Gly 


Ala Val Val 


Leu Cys Arg Pro 


Ser Pro Leu Asn 


1 


5 




10 


15 


Gln


Thr


Gly Thr Gly


Gln Gly Leu Ser


Cys Gly Ser His 


Met Trp Arg 






20 


25 




30 


Cys 


Glu 


Ala Thr Pro 


Cys Gly Val Cys 


Gly Glu Ser Pro 


Val Gly Ser 




35 


40 


45 




Leu 


Leu 


Lys Gln His


Arg Gly Arg Gly 


Lys Thr Trp Pro 


Val Gly Thr 




50 


55 


60 


Ser Leu Gly 


Val 


Ser 


Ala Cys Arg 


Glu Glu Ser Glu 


Ala Gly Ser Leu 


65 




70 


75 


80 


Trp 


Ser 


Leu Leu Pro 


Ser Pro Val Gly 


Leu Gly Ala Val 


Leu Ile Leu




85 




90 


95 


Lys 


Arg 


Cys Gly Ser 


Leu Cys Pro Leu 


Pro Gly Val Gln


Gly Asn Arg 


100 


105 




110 


Arg 


Gly 


His Trp Ala 


Cys Phe Leu Pro 


Pro Asp Pro Ala 


Ser Pro Thr 




115 


120 


125 


Ser Lys Val 


Pro 


Cys 


Ile Ile Gly


Asn Phe His Leu


Lys Ile Phe Leu




130 




135 


140 





<210> 95 
<211> 425 
<212> PRT 



<213> Homo sapiens 



<400> 95 










Met 


Gly 


Gly Gly 


Asp 


Leu Asn Leu Lys 


Lys Ser Trp His 


Pro Gln Thr


1 


5 




10 


15 


Leu 


Arg 


Asn Val 


Glu 


Lys Val Trp Lys 


Ala Glu Gln Lys


His Glu Ala 




20 




25 




30 


Glu 


Arg 


Lys Lys 


Ile


Glu Glu Leu Gln


Arg Glu Leu Arg 


Glu Glu Arg 




35 




40 


45 




Ala 


Arg 


Glu Glu 


Met 


Gln Arg Tyr Ala


Glu Asp Val Gly 


Ala Val Lys 




50 






55 


60 




Lys 




Glu Glu 


Lys 


Leu Asp Trp Met 


Tyr Gln Gly Pro


Gly Gly Met 


65 




70 


75 


80 


Val 


Asn 


Arg Asp 


Glu 


Tyr Leu Leu Gly 


Arg Pro Ile Asp


Lys Tyr Val 






85 




90 


95 


Phe 


Glu 


Lys Met 


Glu 


Glu Lys Glu Ala 


Gly Cys Ser Ser 


Glu Thr Gly 






100 




105 




110 




Leu 


Pro Gly 


Ser 


Ile Phe Ala Pro


Ser Gly Ala Asn 


Ser Leu Leu 






115 




120 


125 




Asp 


Met 


Ala Ser 


Lys 


Ile Arg Glu Asp


Pro Leu Phe Ile


Ile Arg Lys


130 






135 


140 




Lys 


Glu 


Glu Glu 




Lys Arg Glu Val 


Leu Asn Asn Pro 


Val Lys Met 


145 






150 


155 


160 



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Lys Lys 


Ile


Lys 


Glu 


Leu Leu Gln Met


Ser Leu Glu Lys 


Lys Glu 


Lys 




165 




170 


175 




Lys Lys 


Lys 


Lys 


Glu 


Lys Lys Lys Lys 


His Lys Lys His 


Lys His 


Arg 




180 




185 




190 




Ser Ser 


Ser 


Ser 


Asp 


Arg Ser Ser Ser 


Glu Asp Glu His 


Ser Ala 


Gly 




195 






200 


205 






Arg Ser 


Gln


Lys 


Lys 


Met Ala Asn Ser 


Ser Pro Val Leu 


Ser Lys 


Val 


210 








215 


220 






Pro Gly 


Tyr 


Gly 


Leu 


Gln Val Arg Asn


Ser Asp Arg Asn


Gln Gly


Leu 


225 






230 


235 




240 


Gln Gly


Pro 


Leu 


Thr 


Ala Glu Gln Lys


Arg Gly His Gly 


Met Lys 


Asn 






245 




250 


255 




His Ser 


Arg 


Ser 


Arg 


Ser Ser Ser His 


Ser Pro Pro Arg 


His Ala 


Ser 




260 


265 




270 




Lys Lys 


Ser 


Thr 


Arg 


Glu Ala Gly Ser 


Arg Asp Arg Arg 


Ser Arg 


Ser 


275 






280 


285 






Leu Gly 


Arg 


Arg 


Ser 


Arg Ser Pro Arg 


Pro Ser Lys Leu 


His Asn 


Ser 


290 








295 


300 






Lys Val 


Asn 


Arg 


Arg 


Glu Thr Gly Gln


Thr Arg Ser Pro 


Ser Pro 


Lys 


305 




310 


315 




320 


Lys Glu 


Val 


Tyr 


Gln


Arg Arg His Ala 


Pro Gly Tyr Thr 


Arg Lys 


Leu 




325 




330 


335




Ser Ala 


Glu 


Glu 


Leu 


Glu Arg Lys Arg 


Gln Glu Met Met




Ala 






340 




345 




350 




Lys Trp 


Arg 


Glu 


Glu 


Glu Arg Leu Asn 


Ile Leu Lys Arg


His Ala 


Lys 


355 






360 


365 






As Glu 




Arg 


Glu 


Gln Arg Leu Glu


Lys Leu Asp Ser 


Arg Asp 


Gly 


370 








375 


380 






Lys Phe 


Ile




Arg 


Met Lys Leu Glu 


Ser Ala Ser Thr 


Ser Ser 


Leu 


385 








390 


395 




400 


Glu Asp 


Arg 


Val 




Arg Asn Ile Tyr


Ser Leu Gln Arg


Thr Ser 
415 


Val 






405 




410 




Ala Leu 


Glu 




Asn 


Phe Met Lys Arg 












420 




425 








<210> 96 














<211> 394 














<212> PRT 














<213> Homo sapiens












<400> 96 














Met Phe 


Ser 


Val 


Phe 


Glu Glu Ile Thr


Arg Ile Val Val


Lys Glu 


Met 


1 






5 




10 


15 




Asp Ala 


Gly 


Gly 


Asp 


Met Ile Ala Val


Arg Ser Leu Val 


Asp Ala 


Asp 




20 




25 




30 




Arg Phe 


Arg 
35 


Cys 


Phe 


His Leu Val Gly 


Glu Lys Arg Thr 


Phe Phe 


Gly 




40 


45 






Cys Arg 


His 


Tyr 


Thr 


Thr Gly Leu Thr 


Leu Met Asp Ile


Leu Asp 


Thr 


50 








55 


60 






His Gly 


Asp 


Lys 


Trp 


Leu Asp Glu Leu 


Asp Ser Gly Leu 


Gln Gly


Gln


65 








70 


75 




80 


Lys Ala 


Glu 


Phe 


Gln


Ile Leu Asp Asn


Val Asp Ser Thr 


Gly Glu 


Leu 






85 




90 


95 
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lie Val 


Arg 


Leu Pro Lys Glu 


Ile Thr


Ile Ser Gly


Ser 


Phe 


Gln


Gly 




100 


105 






110 






Phe His 


His 


Gln Lys Ile Lys


Ile Ser


Glu Asn Arg 


Ile


Ser


Gln


Gln




115 


120 




125 








Tyr Leu 


Ala 


Thr Leu Glu Asn 


Arg Lys 


Leu Lys Arg 


Glu 


Leu 


Pro 


Phe 


130 




135 




140 










Ser Phe 


Arg 


Ser Ile Asn Thr


Arg Glu 


Asn Leu Tyr 


Leu 


Val 


Thr 


Glu 


145 


150 




155 








160 


Thr Leu 


Glu 


Thr Val Lys Glu 


Glu Thr 


Leu Lys Ser 


Asp 


Arg 


Gln


Tyr 




165 




170 






175 




Lys Phe 


Trp 


Ser Gln Ile Ser


Gln Gly


His Leu Ser 


Tyr 


Lys 


His 


Lys 


180 


185 






190 






Gly Gln


Arg


Glu Val Thr Ile


Pro Pro 


Asn Arg Val 


Leu 


Ser 


Tyr 


Arg 


195 




200 




205 








Val Lys 


Gln


Leu Val Phe Pro 


Asn Lys 


Glu Thr Met 


Arg 


Lys 


Ser 


Leu 


210 




215 




220 










Gly Ser 


Glu 


Asp Ser Arg Asn 


Met Lys 


Glu Lys Leu 


Glu 


Asp 


Met 


Glu 


225 




230 




235 








240 


Ser Val 


Leu 


Lys Asp Leu Thr 


Glu Glu 


Lys Arg Lys 


Asp 


Val 


Leu 


Asn 






245 




250 






255 




Ser Leu 


Ala 


Lys Cys Leu Gly 


Lys Glu 


Asp Ile Arg


Gln


Asp 


Leu 


Glu 






260 


265 






270 






Gln Arg


Val 


Ser Glu Val Leu 


Ile Ser


Gly Glu Leu 


His 


Met 


Glu 


Asp 


275 




280 




285 








Pro Asp 


Lys 


Pro Leu Leu Ser 


Ser Leu 


Phe Asn Ala 


Ala 


Gly 


Val 


Leu 


290 


295 




300 










Val Glu 


Ala 


Arg Ala Lys Ala 


Ile Leu


Asp Phe Leu 


Asp 


Ala 


Leu 


Leu 


305 




310 




315 








3 


Glu Leu 


Ser 


Glu Glu Gln Gln


Phe Val 


Ala Glu Ala 


Leu 


Glu 


Lys 


Gly 






325 




330 










Thr Leu 


Pro 


Leu Leu Lys Asp 


Gln Val


Lys Ser Val 


Met 


Glu 


Gln


Asn 






340 


345 






350 






Trp Asp 


Glu 


Leu Ala Ser Ser


Pro Pro 


Asp Met Asp 


Tyr 


Asp 


Pro 


Glu 


355 




360 




365 








Ala Arg 


Ile


Leu Cys Ala Leu 


Tyr Val 


Val Val Ser 


Ile


Leu 


Leu 


Glu 


370 




375 




380 










Leu Ala 


Glu 


Gly Pro Thr Ser 


Val Ser 


Ser 










385 




390 














<210> 97 
















<211> 456 
















<212> PRT 
















<213> Homo sapiens














<400> 97 
















Met Glu Gly 


Pro Glu Gly Leu 


Gly Arg 


Lys Gln Ala


Cys 


Leu 


Ala 


Met 


1 




5 




10 






15 




Leu Leu 


His 


Phe Leu Asp Thr 


Tyr Gln


Gly Leu Leu 


Gln


Glu 


Glu 


Glu 






20 


25 






30 






Gly Ala Gly 


His Ile Ile Lys


Asp Leu 


Tyr Leu Leu 


Ile


Met 


Lys 


Asp 




35 




40 




45 








Glu Ser 


Leu 


Tyr Gln Gly Leu


Arg Glu 


Asp Thr Leu 


Arg 


Leu 


His 


Gln



50 55 60 
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Leu 


Val Glu Thr 


Val Glu Leu 


Lys 


Ile Pro Glu


Glu Asn


Gln


Pro 


Pro 


65 




70 




75 








80 


Ser 


Lys Gln Val


Lys Pro Leu 


Phe 


Arg His Phe 


Arg Arg 


Ile


Asp 


Ser 




85 




90 






95 




Cys 


Leu Gln Thr


Arg Val Ala 


Phe 


Arg Gly Ser 


Asp Glu 


Ile


Phe 


Cys 


100 




105 




110 






Arg 


Val Tyr Met 


Pro Asp His 


Ser 


Tyr Val Thr 


Ile Arg


Ser 


Arg 


Leu 


115 




120 




125 








Ser 


Ala Ser Val 


Gln Asp Ile


Leu 


Gly Ser Val 


Thr Glu 


Lys 


Leu 


Gln




130 


135 






140 








Tyr 


Ser Glu Glu 


Pro Ala Gly 


Arg 


Glu Asp Ser 


Leu Ile


Leu 


Val 


Ala 


145 




150 




155 








160 


Val 


Ser Ser Ser 


Gly Glu Lys 
165 


Val 


Leu Leu Gln
170 


Pro Thr 


Glu 


Asp 
175 


Cys 


Val 


Phe Thr Ala 
180 


Leu Gly Ile


Asn 


Ser His Leu 

185 


Phe Ala 


Cys 
190 


Thr 


Arg 


Asp 


Ser Tyr Glu 


Ala Leu Val 


Pro 


Leu Pro Glu 


Glu Ile


Gln


Val 


Ser 


195 




200 




205 








Pro 


Gly Asp Thr 
210 


Glu Ile His
215 


Arg 


Val Glu Pro 


Glu Asp 
220 


Val 


Ala 


Asn 


His 


Leu Thr Ala 


Phe His Trp 


Glu 


Leu Phe Arg 


Cys Val 


His 


Glu 


Leu 


225 




230 




235 








Thr


Glu 


Phe Val Asp 


Tyr Val Phe 


His 


Gly Glu Arg 


Gly Arg 


Arg 


Glu 






245 




250 






255 




Ala 


Asn Leu Glu 


Leu Leu Leu 


Gln


Arg Cys Ser 


Glu Val 


Thr 


His 


Trp 




260 






265 




270 






Val 


Ala Thr Glu 
275 


Val Leu Leu 


Cys 
280 


Glu Ala Pro 


Gly Lys 
285 


Arg 


Ala 


Gln


Leu 


Leu Lys Lys 


Phe Ile Lys


Ile


Ala Ala Leu 


Cys Lys 


Gln


Asn


Gln




290 


295 






300 








Asp 


Leu Leu Ser 


Phe Tyr Ala 


Val 


Val Met Gly 


Leu Asp 


Asn 


Ala 


Ala 
320 


305 




310 




315 








Val 


Ser Arg Leu 


Arg Leu Thr 


Trp 


Glu Lys Leu 


Pro Gly 


Lys 


Phe 


Lys 




325 




330 






335 




Asn 


Leu Phe Arg 


Lys Phe Glu 


Asn 


Leu Thr Asp 


Pro Cys 


Arg 


Asn 


His 




340 




345 




350 






Lys 


Ser Tyr Arg 


Glu Val Ile


Ser 


Lys Met Lys 


Pro Pro 


Val 


Ile


Pro 


355 




360 




365 








Phe 


Val Pro Leu 

370 


Ile Leu Lys
375 


Asp 


Leu Thr Phe 


Leu His 

380 


Glu 


Gly 


Ser 


Lys 


Thr Leu Val 


Asp Gly Leu 


Val 


Asn Ile Glu


Lys Leu 


His 


Ser 




385 




390 




395 








400 


Ala 


Glu Lys Val 


Arg Thr Ile


Arg 


Lys Tyr Arg 












405 




410 






415 




Leu 


Asp Met Glu 


Ala Ser Pro 


Asn 


His Leu Gln


Thr Lys 


Ala 


Tyr 


Val 




420 






425 




430 






Arg 


Gln Phe Gln


Val Ile Asp


Asn


Gln Asn Leu


Leu Phe 


Glu 


Leu 


Ser 


435 




440 




445 








Tyr 


Lys Leu Glu 
450 


Ala Asn Ser 
455 


Gln













<210> 98 

<211> 715 

<212> PRT 

<213> Homo sapiens 
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<400> 98 
















Met


Ser


Gln Val Met Ser Ser


Pro 


Leu 


Leu Ala 


Gly Gly 


His 


Ala 


Val 


1 




5 






10 






15 




Ser 


Leu 


Ala Pro Cys Asp Glu 


Pro 


Arg 


Arg Thr 


Leu His 


Pro 


Ala 


Pro 






20 




25 






30 






Ser


Pro


Ser Leu Pro Pro Gln


Cys 


Ser 


Tyr Tyr 


Thr Thr 


Glu 


Gly 


Trp 






35 


40 






45 








Gly 


Ala 


Gln Ala Leu Met Ala


Pro 


Val 


Pro Cys 


Met Gly 


Pro 


Pro 


Gly 


50 


55 








60 








Arg 


Leu 


Gln Gln Ala Pro Gln


Val 


Glu 


Ala Lys 


Ala Thr 


Cys 


Phe 


Leu 


65 




70 






75 








80 


Pro 


Ser 


Pro Gly Glu Lys Ala 


Leu 


Gly 


Thr Pro 


Glu Asp 


Leu 


Asp 


Ser 






85 






90 






95 




Tyr 


Ile


Asp Phe Ser Leu Glu


Ser


Leu


Asn Gln


Met Ile


Leu 


Glu 


Leu 




100 




105 






110 






Asp 


Pro 


Thr Phe Gln Leu Leu


Pro 


Pro 


Gly Thr 


Gly Gly 


Ser 


Gln


Ala 




115 


120 






125 








Glu 


Leu 


Ala Gln Ser Thr Met


Ser 


Met 


Arg Lys 


Lys Glu 


Glu 


Ser 


Glu 




130 


135 








140 








Ala 


Leu 


Asp Ha Lys Tyr Ile


Glu 


Val 


Thr Ser 


Ala Arg 


Ser 


Arg 


Cys 


145 




150 






155 








160 


His 


Asp 


Trp Pro Gln His Cys


Ser 


Ser 


Pro Ser 


Val Thr 


Pro 


Pro 


Phe 




165 






170 






175 




Gly 


Ser 


Pro Arg Ser Gly Gly 


Leu 


Leu 


Leu Ser 


Arg Asp 


Val 


Pro 


Arg 




180 




185 






190 






Glu 


Thr 


Arg Ser Ser Ser Glu 


Ser 


Leu 


Ile Phe


Ser Gly 


Asn 


Gln


Gly 






195 


200 






205 








Arg 


Gly 


His Gln Arg Pro Leu


Pro 


Pro 


Ser Glu


Gly Leu 


Ser 


Pro 


Arg 


210 


215 








220 








Pro 


Pro 


Asn Ser Pro Ser Ile


Ser


Ile


Pro Cys 


Met Gly 


Ser 


Lys 


Ala 


225 




230 






235 








240 


Ser 


Ser 


Pro His Gly Leu Gly 


Ser 


Pro 


Leu Val 


Ala Ser 


Pro 


Arg 


Leu 






245 






250 






255 




Glu 


Lys 


Arg Leu Gly Gly Leu 


Ala 


Pro 


Gln Arg


Gly Ser 


Arg 


Ile


Ser 




260 




265 






270 






Val 


Leu 


Ser Ala Ser Pro Val 


Ser 


Asp 


Val Ser 


Tyr Met 


Phe 


Gly 


Ser 






275 


280 






285 








Ser 


Gln


Ser Leu Leu His Ser 


Ser 


Asn 


Ser Ser 


His Gln


Ser 


Ser 


Ser 




290 


295 








300 








Arg 


Ser 


Leu Glu Ser Pro Ala 


Asn 


Ser 


Ser Ser 


Ser Leu 


His 


Ser 


Leu 


305 




310 






315 








320 


Gly 


Ser 


Val Ser Leu Cys Thr 


Arg 


Pro 


Ser Asp 


Phe Gln


Ala 


Pro 


Arg 




325 






330 






335 




Asn 


Pro 


Thr Leu Thr Met Gly 


Gln


Pro 


Arg Thr 


Pro His 


Ser 


Pro 


Pro 






340 




345 






350 






Leu 


Ala 


Lys Glu His Ala Ser 


Ile


Cys 


Pro Pro 


Ser Ile


Thr 


Asn 


Ser 






355 


360 






365 








Met 


Val 


Asp Ile Pro Ile Val


















370 


375 








380 








Ser 


Ser 


Pro Pro Gln Arg Thr


Pro 


Gly 


His Gln


Asn Ser 


Val 


Gln


Pro 


385 




390 






395 








400 


Gly 


Ala 


Ala Ser Pro Ser Asn 


Pro 


Cys 


Pro Ala 


Thr Arg 


Ser 


Asn 


Ser 




405 






410 






415 




Gln


Thr 


Leu Ser Asp Ala Pro 


Phe 


Thr 


Thr Cys 


Pro Glu 


Gly 


Pro 


Ala 






420 




425 






430 






Arg 


Asp 


Met Gln Pro Thr Met


Lys 


Phe 


Val Met 


Asp Thr 


Ser 


Lys 


Tyr 


435 


440 






445 
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Trp 


Phe 


Lys 


Pro Asn Ile Thr


Arg una 




U 


Leu 




Arg 


450 


455 




460 










Lys 


Glu 


Glu 


Pro Gly Ala Phe 


V S 9 475 


r 
er 


Ser 
er 


Ser 


Tyr 


Arg 


465 


















480 


Gly 


Ser


Phe 


Gly Leu Ala Leu 


Lys Val Gln Glu


V 1 
a 




Al 


Ser 








485 










495 




Gln


Asn 


Arg 


Pro Gly Glu Asp 


Ser Asn Asp Leu 




9 




Phe 


Leu 
U 














510 






Ile


Glu 


Ser 


Ser Ala Lys Gly 


Val His Leu Lys 


y 




P 


Glu 
U 


Glu 






515 




5 ?° ™. ,. , _ 




525 








Pro 


Tyr 


Phe 


Gly Ser Leu Ser 


Ala Phe Val Cys 




H" 
is 


er 


Ile


Met 




530 




535 




540 










Ala 


Leu 




Leu Pro Cys Lys 


Leu Thr Ile Pro


n 




Gl 

U 


Leu 
eu 


Gl 


545 






550 












560 


Gly 


Ala 


Asp 


Gly Ala Ser Asp 


Ser Thr Asp Ser 


ro 


a 


er 




Gln






565 


T m7° 








575 




Lys 


Lys 


Ser 


Ala Gly Cys His 


Thr Leu Tyr Leu 


er 


Ser 
er 


Val 


Ser 


Val 




580 








590 






Glu 


Thr 


Leu 


Thr Gly Ala Leu 


Ala Val Gln Lys






er 


Th 

r 


Thr 






595 








605 








Phe 


Glu 


Arg 


Asp Ile Leu Pro


Thr Pro Thr Val 


a 


is 


e 


U 


V 1 




610 


615 














Thr 


Glu 


Gln


Gly Ile Thr Leu


Thr Asp Val Gln


g 


ys 


Val 


Phe 


Phe 


625 






in, 












640 


Arg 


Arg 


His 


Tyr Pro Leu Thr 


Phe 


-ys 


Gl 


Met 


Asp 


Pro 






r ^ 650 








655 




Glu 


Gln


Arg


Lys Trp Gln Lys


Tyr Cys Lys Pro 


Ser 


Trp 


Ile


Phe 


Gly 








660 


665 






670 






Phe 


Val 


Ala 


Lys Ser Gln Thr


Glu Pro Gln Glu


Asn 


Val 


Cys 


His 


Leu 






675 


680 




685 








Phe 


Ala 


Glu 


Tyr Asp Met Val 


Gin Pro Ala Ser 


Gln


Val


Ile


Gly 


Leu 




690 




695 




700 










Val 


Thr 


Ala 


Leu Leu Gln Asp


Ala Glu Arg Met 












705 






710 


715 












<210> 99 
















<211> 35 
















<212> DNA
















<213> Artificial sequence 















<220>

<223> PCR primer 

<400> 99 

ccatatataa aaccactgtc ctgtcctttg tggct 

<210> 100

<211> 26 

<212> DNA 


<213> Artificial sequence 
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<223> PCR primer 
<400> 100 

cccccatctg tctgtctata tttgtc



<210> 101 

<211> 22 

<212> DNA 

<213> Artificial sequence



<223> PCR primer 

<400> 101 

tgcctacgct gacgactatg tg

<210> 102 

<211> 25 

<212> DNA

<213> Artificial sequence 



<223> PCR primer 

<400> 102 

tttggttttc tacaactgtt gctat 

<210> 103 

<211> 19 

<212> DNA 

<213> Artificial sequence 



<220> 

<223> PCR primer 
<400> 103 

gggctccaca caccagatg 
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<210> 104 

<211> 21 

<212> DNA 

<213> Artificial sequence 



<223> PCR primer 

<400> 104 

acgctctgag caccctctac a 

<210> 105 

<211> 31 

<212> DNA 

<213> Artificial 



<223> PCR primer 

<400> 105 

tgtcacaggg actgaaaacc tctcctcatg t 

<210> 106 

<211> 17 

<212> DNA 

<213> Artificial sequence 



<220> 

<223> PCR primer 

<400> 106 
cccaaggcca cgagctt

<210> 107 

<211> 24 

<212> DNA

<213> Artificial sequence 
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<223> PCR primer 

<400> 107 

tgttgctctc ttaacgaatc gaaa

<210> 108 

<211> 29 

<212> DNA 

<213> Artificial 



<223> PCR primer 

<400> 108 

ctggtcaaac aaactatctg 

<210> 109 

<211> 20 

<212> DNA 

<213> Artificial sequence



<223> PCR primer 

<400> 109 

tggtgaggaa aagcggacat 

<210> 110 

<211> 21 

<212> DNA 

<213> Artificial sequence 



<223> PCR primer 
<400> 110 

ctggcttgga ggacagtgaa g 
<210> 111 
<211> 24 
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<212> DNA 

<213> Artificial sequence 



<223> PCR primer 

<400> 111 

ccaagccctc cccatcccat gtat 

<210> 112 

<211> 21 

<212> DNA 

<213> Artificial 



<220> 

<223> PCR primer 

<400> 112

gaggtgtcgt accgcgttct a 

<210> 113 

<211> 21

<212> DNA 

<213> Artificial sequence 


<220> 

<223> PCR primer 

<400> 113 

ccgttctgct cttccctgtc t 

<210> 114 

<211> 23 

<212> DNA 

<213> Artificial sequence 



<223> PCR primer 
<400> 114 

ccagacccgc ttcactgacc tgc 
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<210> 115 

<211> 20 

<212> DNA 

<213> Artificial sequence 



<223> PCR primer 

<400> 115 

cgcctgtact tcagcatgga 

<210> 116 

<211> 18 

<212> DNA 

<213> Artificial sequence 



<223> PCR primer 

<400> 116 
gcggttcagc tggtggaa 

<210> 117 

<211> 25 

<212> DNA 

<213> Artificial 



<223> PCR primer 

<400> 117 

accccgaggc atcaccacaa atcat 

<210> 118 

<211> 23 

<212> DNA 

<213> Artificial sequence 



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<223> PCR primer 

<400> 118 

agttctgcct ctctgacaac cat 

<210> 119 

<211> 23 

<212> DNA 

<213> Artificial sequence 



<223> PCR primer 

<400> 119 

taggctcaga gtcagaccca aac

<210> 120 

<211> 21 

<212> DNA 

<213> Artificial sequence 



<223> PCR primer 

<400> 120 

ccctcgtggg cttgtgctcg g 

<210> 121 

<211> 21 

<212> DNA 

<213> Artificial sequence 



<223> PCR primer 
<400> 121 

aagccgccag ttcatctttt t 
<210> 122 
<211> 25 






EP 1 365 034 A2 



<212> DNA 

<213> Artificial sequence 



<223> PCR primer 

<400> 122 

cttgtggttc aagtcaaatg ttcag 

<210> 123 

<211> 21 

<212> DNA 

<213> Artificial 



<223> PCR primer 

<400> 123 

tctgcctgcg ctctcgtcgg t 

<210> 124 

<211> 18 

<212> DNA 

<213> Artificial sequence 



<223> PCR primer 

<400> 124 
gggctgggca cctgactt 

<210> 125 

<211> 20 

<212> DNA 

<213> Artificial sequence 



<223> PCR primer 
<400> 125 

cccaacaagg gtcccagact 



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<210> 126 

<211> 17

<212> DNA 

<213> Artificial sequence 




<220> 



<223> PCR primer 
<400> 126 
cggcgcattg agcggcg 



<210> 127

<211> 20

<212> DNA



<213> Artificial sequence 






<220> 

<223> PCR primer 

<400> 127 

cccaagggac ttcgtgaatg 

<210> 128 

<211> 21 

<212> DNA 

<213> Artificial sequence 



<223> PCR primer 

<400> 128 

ggcgatccct gatgacaagt a 

<210> 129 

<211> 29 

<212> DNA 

<213> Artificial sequence 



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<223> PCR primer* 

<400> 129 

agcaccaact gtgaaccagg tacaatggc

<210> 130 

<211> 19 

<212> DNA 

<213> Artificial sequence



<223> PCR primer 

<400> 130 

gagggaggct ctgctttgg 

<210> 131 

<211> 21 

<212> DNA 

<213> Artificial sequence 



<223> PCR primer 

<400> 131 

tcacaactag cgggtgagga g 

<210> 132 

<211> 21 

<212> DNA 

<213> Artificial sequence



<223> PCR primer 
<400> 132 

tgcagaggaa cggcgtgagc g 
<210> 133 
<211> 22 






<212> DNA

<213> Artificial sequence 



<223> PCR primer 

<400> 133 

tgaggtttcc tcccaaatcg ta 

<210> 134 

<211> 22 

<212> DNA 

<213> Artificial sequence 



<223> PCR primer 

<400> 134 

cagctcaagg gaagctgtca tc 

<210> 135 

<211> 24 

<212> DNA

<213> Artificial 



<223> PCR primer 

<400> 135 

cccccacatg ttccccaaga tgct 

<210> 136 

<211> 21 

<212> DNA 

<213> Artificial sequence 



<223> PCR primer 
<400> 136 

ggaggcgcta aaggtctacg t 






<210> 137 

<211> 21 

<212> DNA

<213> Artificial sequence 



<223> PCR primer 

<400> 137 

tgatgcttcg caggtcagta t 

<210> 138 

<211> 26 

<212> DNA 

<213> Artificial 



<223> PCR primer 

<400> 138 

ctcctgcccc tcctaaagct gaagcc 

<210> 139 

<211> 17 

<212> DNA 

<213> Artificial 



<223> PCR primer 

<400> 139 
ggacgcgtgg gcttttc 

<210> 140 

<211> 20 

<212> DNA 

<213> Artificial sequence 



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<220> 



<223> PCR primer 
<400> 140 

tgtggctgtg gacacctttc 



<210> 141 



<211> 25 
<212> DNA 



<213> Artificial sequence 






<220> 



<223> PCR primer 
<400> 141 

ccacaagctg aaggcagaca aggcc 
<210> 142 



<211> 20 
<212> DNA 



<213> Artificial sequence 



<223> PCR primer 

<400> 142 

gcggattctc atggaacaca 

<210> 143 

<211> 20 

<212> DNA 

<213> Artificial sequence 



<220> 

<223> PCR primer 
<400> 143 

ggtcagccag gagcttcttg 
<210> 144 
<211> 23 



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<212> DNA 

<213> Artificial sequence 



<223> PCR primer 

<400> 144 

accaccttgc gcaggttgtc cag 

<210> 145 

<211> 18 

<212> DNA

<213> Artificial 



<210> 146 

<211> 23 

<212> DNA

<213> Artificial sequence 



<223> PCR primer 

<400> 146 

gtctcgatct tggacagctt ctg 

<210> 147 

<211> 22 

<212> DNA 

<213> Artificial sequence 



<220> 

<223> PCR primer 
<400> 147 

acactgtcca cacggcccga gg 



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<210> 148 

<211> 21 

<212> DNA

<213> Artificial sequence 



<223> PCR primer 

<400> 148 

ctgggcagaa tggaaggatc t

<210> 149 

<211> 22 

<212> DNA 

<213> Artificial sequence 



<223> PCR primer 

<400> 149 

gggactctag cagacccaca ct 

<210> 150 

<211> 22 

<212> DNA 

<213> Artificial sequence 



<223> PCR primer 

<400> 150 

cacccacctg gattccctgt tc

<210> 151 

<211> 23 

<212> DNA 

<213> Artificial sequence 



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<223> PCR primer 

<400> 151 

ccttcagaca ggcgtagatg atg 

<210> 152 

<211> 29 

<212> DNA 

<213> Artificial sequence 



<223> PCR primer 

<400> 152 

gggtattatt tctttattag gtgccactt

<210> 153 

<211> 30 

<212> DNA 

<213> Artificial sequence 



<223> PCR primer 

<400> 153 

ttccctaagg ctttcagtac ccaggatctg 

<210> 154 

<211> 18 

<212> DNA 

<213> Artificial sequence 



<223> PCR primer 
<400> 154 
ccagcttggc cctttcct 

<210> 155 

<211> 23 



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<212> DNA 

<213> Artificial sequence 



<223> PCR primer 
<400> 155 

gaatgggtcg cttttgttct tag 



<210> 156 

<211> 22 

<212> DNA 

<213> Artificial sequence 



<223> PCR primer 

<400> 156 

tcacggacct cagcctgccc ct 

<210> 157 

<211> 21 

<212> DNA 

<213> Artificial 



<223> PCR primer 

<400> 157 

tggtgaaggt gtcagccatg t 

<210> 158 

<211> 21 

<212> DNA 

<213> Artificial sequence 



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<220> 

<223> PCR primer 

<400> 158 

tcagagtgca gcaatggctt t 

<210> 159 

<211> 20 

<212> DNA 

<213> Artificial sequence 



<220> 

<223> PCR primer 

<400> 159

acctccttcc ccagctcccc 

<210> 160 

<211> 24 


<212> DNA 

<213> Artificial sequence 



<223> PCR primer 

<400> 160 

ggcaacatct tacttgtcct ttga 

<210> 161 

<211> 25 

<212> DNA 

<213> Artificial sequence 



<223> PCR primer 
<400> 161 

ccaaggaagc acagacaact atttc 
<210> 162 
<211> 30 



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<212> DNA 

<213> Artificial sequence



<220> 

<223> PCR primer

<400> 162 

tcctccctat ccatggcact aaaccacttc

<210> 163 

<211> 19

<212> DNA 

<213> Artificial sequence 



<220> 

<223> PCR primer 

<400> 163

tgggcaaggg ctcctatct 

<210> 164 

<211> 21

<212> DNA 

<213> Artificial sequence 


<220> 

<223> PCR primer 

<400> 164 
gttacccctg gcagacgtat g

<210> 165 

<211> 31 

<212> DNA 

<213> Artificial sequence 



<220> 

<223> PCR primer 
<400> 165 

tgcctctgag tctgaatctc ccaaagagag a 
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<210> 166 

<211> 31 

<212> DNA 

<213> Artificial sequence 



<223> PCR primer 

<400> 166 

gagtagttat gtgattattt cagctcttga c 

<210> 167 

<211> 21 

<212> DNA 

<213> Artificial sequence 



<223> PCR primer 

<400> 167 

tcaaatgttg tccccgagtc t 

<210> 168 

<211> 34 

<212> DNA 

<213> Artificial 



<223> PCR primer 

<400> 168 

cagaaattcg gaagacagaa ctattgtcat gcct 

<210> 169 

<211> 27 

<212> DNA 

<213> Artificial sequence 



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<223> PCR primar 

<400> 169 

gattagtaac ccatagcagt tgaaggt 

<210> 170 

<211> 26 

<212> DNA 

<213> Artificial sequence 



<223> PCR primer 

<400> 170 

atttactgaa ggtggtctga acatac 

<210> 171 

<211> 31 

<212> DNA 

<213> Artificial sequence 



<223> PCR primer 

<400> 171 

tgacagactc caaatcacaa gcacagtcaa < 

<210> 172 

<211> 25 

<212> DNA 

<213> Artificial sequence 



<223> PCR primer 
<400> 172 

tgatggtttg gaggaaagtt tattt 
<210> 173 
<211> 24 



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<212> DNA 

<213> Artificial sequence 



<223> PCR primer 

<400> 173 

tttggttggg tctttagagg aatc 

<210> 174 

<211> 24 

<212> DNA 

<213> Artificial sequence 



<223> PCR primer 

<400> 174 

tgccaaccat gcatcaggta gccc 

<210> 175 

<211> 20 

<212> DNA

<213> Artificial sequence 



<223> PCR primer 

<400> 175 

cagctcacct ggcaacttca 

<210> 176 

<211> 20 

<212> DNA 

<213> Artificial sequence 



<223> PCR primer 
<400> 176 

cctgattttc ccagcgatgt 
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<210> 177 

<211> 19 

<212> DNA 

<213> Artificial sequence 



<223> PCR primer 
<400> 177 

cgccgctccc ggttctgct 



<210> 178 

<211> 20 

<212> DNA 

<213> Artificial 



<223> PCR primer 

<400> 178 

tggccaagcg taagctgatt 

<210> 179 

<211> 21 

<212> DNA 

<213> Artificial sequence 



<223> PCR primer 

<400> 179 

gctgcagtga tcggatcatc t 

<210> 180 

<211> 22 

<212> DNA 

<213> Artificial 






<223> MLLT6 

<400> 180 

caccatggag cccatcgtgc tg

<210> 181 

<211> 19 

<212> DNA 

<213> Artificial 



<223> MLLT6 for 

<400> 181 

atccccgagg tgcaatttg

<210> 182 

<211> 21 

<212> DNA 

<213> Artificial Sequence 



<223> MLLT6 rev 

<400> 182 

agcgatcatg aggcacgtac t 

<210> 183 

<211> 29 

<212> DNA 

<213> Artificial Sequence 



<223> ZNF144 
<400> 183 

cctgccagag ataggagacc cagacagct 
<210> 184 
<211> 19 



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<212> DNA 

<213> Artificial Sequence 



<223> ZNF144 for 

<400> 184 

atccccctga gccttttca 

<210> 185 

<211> 19 

<212> DNA 

<213> Artificial Sequence 



<220> 

<223> ZNF144 rev 

<400> 185

cagcctctgg tcccaccat 

<210> 186 

<211> 28

<212> DNA

<213> Artificial Sequence 


<220> 

<223> PIP5K2B 

<400> 186 

tgatcatcaa ttccaaacct ctcccgaa 

<210> 187 

<211> 19 

<212> DNA 

<213> Artificial Sequence 



<223> PIP5K2B for 
<400> 187 

ccccatggtg ttccgaaac 






<210> 188 

<211> 19

<212> DNA 

<213> Artificial Sequence 




<223> PIP5K2B rev 

<400> 188 

tgccaggagc ctccatacc

<210> 189 

<211> 29 

<212> DNA 

<213> Artificial Sequence 



<220> 

<223> TEM7 

<400> 189

cagccttcta aaacacaatg tattcatgt 29 

<210> 190 

<211> 29 


<212> DNA 

<213> Artificial Sequence 



<223> TEM7 for 

<400> 190 

cctgaactta atggtagaat tcaaagatc 

<210> 191 

<211> 27 

<212> DNA 

<213> Artificial Sequence 
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<220> 

<223> TEM7 rev 

<400> 191

tattaacact gagaatccat gcagaga 

<210> 192 

<211> 35

<212> DNA 

<213> Artificial Sequence 



<220> 

<223> ZNFN1A3 

<400> 192 

tatctggtct cagggattgc tcctatgtat tcagc 

<210> 193 

<211> 20 

<212> DNA 

<213> Artificial Sequence 



<223> ZNFN1A3 for 

<400> 193 

cacagagccc tgctgaagtg 

<210> 194 

<211> 23 

<212> DNA 

<213> Artificial Sequence 



<223> ZNFN1A3 rev 
<400> 194 

gcgaggtcat tggtttttag aaa 
<210> 195 
<211> 22 



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<212> DNA 

<213> Artificial Sequence 



<223> WIRE 

<400> 195 

ctgtgatccg aaatggtgcc ag 

<210> 196 

<211> 20 

<212> DNA 

<213> Artificial 



<223> WIRE for 

<400> 196 

ccgtctccac atccaaacct 

<210> 197 

<211> 20 

<212> DNA

<213> Artificial Sequence 



<223> WIRE rev 

<400> 197 

acccatgcat tcggtatggt

<210> 198 

<211> 21 

<212> DNA 

<213> Artificial Sequence 



<223> PSMB3 
<400> 198 

agtggcacct gcgccgaaca a 



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<210> 199 

<211> 21

<212> DNA 

<213> Artificial Sequence 




<223> PSMB3 for 

<400> 199 

ccccatggtg actgatgact t 

<210> 200 

<211> 21 

<212> DNA 

<213> Artificial Sequence 



<223> PSMB3 rev 

<400> 200 

ccagagggac tcacacattc < 

<210> 201 

<211> 29 

<212> DNA 

<213> Artificial 



<220> 



<223> MGC9753 
<400> 201 
ccagaaactt tccatcccaa aggcagtct



<210> 202 
<211> 21 
<212> DNA 



<213> Artificial Sequence 



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<223> MGC9753 for 

<400> 202 

ctgccccaca ggaatagaat g 

<210> 203 

<211> 23 

<212> DNA 

<213> ARTIFICIAL 



<223> MGC9753 rev 

<400> 203 

aaaaatccag tctgcttcaa cca 

<210> 204 

<211> 20 

<212> DNA 

<213> ARTIFICIAL SEQUENCE 



<223> ORMDL3 

<400> 204 

agctgcccca gctccacgga

<210> 205 

<211> 21 

<212> DNA 

<213> ARTIFICIAL SEQUENCE 



<223> ORMDL3 for 
<400> 205 

tccctgatga gcgtgcttat c 
<210> 206 
<211> 28 



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<212> DNA 

<213> ARTIFICIAL SEQUENCE 



<220> 

<223> ORMDL3 rev

<400> 206 

tctcagtact tattgattcc aaaaatcc

<210> 207 

<211> 25

<212> DNA 

<213> ARTIFICIAL SEQUENCE 



<223> MGC15482 

<400> 207 

tGcagtggaa gcaaccccag tgttc 

<210> 208 

<211> 25 

<212> DNA 

<213> ARTIFICIAL SEQUENCE 



<223> MGC15482 for 

<400> 208 

cacttctaga gctaccgtgg agtct 

<210> 209 

<211> 22 

<212> DNA 

<213> ARTIFICIAL SEQUENCE 



<223> MGC15482 rev 
<400> 209 

ccctcacttt gtaacccttg ct 
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<210> 210 

<211> 20 

<212> DNA 

<213> ARTIFICIAL 



<210> 211 

<211> 21 

<212> DNA 

<213> ARTIFICIAL 



<223> PPP1R1B for 

<400> 211 

gggattgttt agccacacat a 

<210> 212 

<211> 20 

<212> DNA 

<213> ARTIFICIAL SEQUENCE 



<223> PPP1R1B rev 

<400> 212 

ccgatgttaa ggcccatagc 

<210> 213 

<211> 27 

<212> DNA 

<213> ARTIFICIAL SEQUENCE 



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<223> MGC14832 

<400> 213 

taaaatgtcc ggccaacatg agttccc

<210> 214 

<211> 17 

<212> DNA 

<213> ARTIFICIAL SEQUENCE 



<223> MGC14832 for 

<400> 214 
agcagtgcct ggcacat 

<210> 215 

<211> 20 

<212> DNA 

<213> ARTIFICIAL SEQUENCE



<223> MGC14832 rev 

<400> 215 

gacaccccct gacctatgga 

<210> 216 

<211> 25 

<212> DNA 

<213> ARTIFICIAL SEQUENCE 



<223> LOC51242 
<400> 216 

cagtgacctc tcccgttccc ttgga
<210> 217 
<211> 20 



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<212> DNA 

<213> ARTIFICIAL SEQUENCE 



<220> 

<223> LOC51242 for

<400> 217 

tgggtccctg tgtcctcttc 

<210> 218 

<211> 20

<212> DNA 

<213> ARTIFICIAL SEQUENCE 



<223> LOC51242 rev

<400> 218 

agggtcagga gggagaaaac 

<210> 219 

<211> 26 

<212> DNA 

<213> ARTIFICIAL SEQUENCE 



<223> FLJ20291 

<400> 219 

ccagtgccca cccgttaaag agtcaa

<210> 220 

<211> 24 

<212> DNA 

<213> ARTIFICIAL SEQUENCE 



<223> FLJ20291 for 
<400> 220 

ttgtgggaca ctcagtaact ttgg 



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<210> 221 

<211> 20 

<212> DNA 

<213> ARTIFICIAL 



<223> FLJ20291 rev 

<400> 221 

acaagcactc ccaccgagat 

<210> 222 

<211> 24 

<212> DNA

<213> ARTIFICIAL SEQUENCE 



<223> PR02521 

<400> 222 

agtctgtcct cactgccatc gcca 

<210> 223 

<211> 21 

<212> DNA 

<213> ARTIFICIAL SEQUENCE 



<223> PR02521 for 

<400> 223 

aagcctctgg gttttccctt t

<210> 224 

<211> 20 

<212> DNA 

<213> ARTIFICIAL SEQUENCE 



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<223> PR02521 rev 

<400> 224 

cccactggtg acaggatggt

<210> 225 

<211> 23 

<212> DNA 

<213> ARTIFICIAL SEQUENCE 



<223> LINK-GEFII 

<400> 225 

catctgacat ctttcccgtg gag 

<210> 226 

<211> 21 

<212> DNA 

<213> ARTIFICIAL SEQUENCE 



<223> LINK-GEFII for 

<400> 226 

ctttgcacga tgtctcaacc a 

<210> 227 

<211> 18 

<212> DNA 

<213> ARTIFICIAL SEQUENCE 



<223> LINK-GEFII rev 
<400> 227 
tttcccgtgg agcaggaa

<210> 228 

<211> 26 



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<213> ARTIFICIAL SEQUENCE 



<223> CTEN 

<400> 228 

ccgccgccta atatgcaaca ttaggg

<210> 229 

<211> 23

<212> DNA

<213> ARTIFICIAL SEQUENCE 



<220> 

<223> CTEN for 

<400> 229 

cgagtattcc aaagctggta tcg 23

<210> 230 

<211> 24 

<212> DNA

<213> ARTIFICIAL SEQUENCE 



<223> CTEN rev 

<400> 230 

atcacagaga gatggccctt atct

<210> 231 

<211> 25 

<212> DNA 

<213> Artificial 



<220> 

<223> D17S946 forward primer 
<400> 231 

acagtctatc aagcagaaaa atcct 25 



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<210> 232 

<211> 16 

<212> DNA 

<213> Artificial Sequence 



<210> 233 

<211> 20 

<212> DNA 

<213> Artificial Sequence 



<223> D17S1181 forward primer 

<400> 233 

gacaacagag cgagactccc 

<210> 234 

<211> 20 

<212> DNA 

<213> Artificial 



<223> D17S1181 reverse primer 

<400> 234 

gcccagcctg tcacttattc 

<210> 235 

<211> 18 

<212> DNA 

<213> Artificial Sequence 



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<223> D17S2026 forward primer 

<400> 235 
tggtcattcg acaacgaa 

<210> 236 

<211> 18 

<212> DNA 

<213> Artificial 



<223> D17S2026 reverse primer 

<400> 236 
cagcafctgga tgcaatcc 

<210> 237 

<211> 20 

<212> DNA 

<213> Artificial 



<223> D17S838 forward primer 

<400> 237 

ctccagaatc cagaccatga 

<210> 238 

<211> 20 

<212> DNA 

<213> Artificial Sequence 



<223> D17S838 reverse primer
<400> 238 

aggacagtgt gtagcccttc 
<210> 239 
<211> 20 



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<212> DNA 

<213> Artificial Sequence 



<223> D17S250 forward primer 

<400> 239 

ggaagaatca aatagacaat 

<210> 240 

<211> 24 

<212> DNA 

<213> Artificial 



<223> D17S250 reverse primer 

<400> 240 

gctggccata tatatattta aacc 

<210> 241 

<211> 23 

<212> DNA 

<213> Artificial Sequence 



<223> D17S1818 forward primer 

<400> 241 

cataggtatg ttcagaaatg tga 

<210> 242 

<211> 18 

<212> DNA 

<213> Artificial Sequence 



<220> 

<223> D17S1818 reverse primer 
<400> 242 
tgcctactgg aaaccaga 






<210> 243 

<211> 23 

<212> DNA 

<213> Artificial 



<223> D17S614 forward primer 

<400> 243 

aaggggaagg ggctttcaaa gct

<210> 244 

<211> 23 

<212> DNA 

<213> Artificial 



<223> D17S614 reverse primer 

<220> 

<221> misc_feature 

<222> (1)..(1)

<223> n=a, c, g or t

<400> 244 

nggaggttgc agtgagccaa gat 

<210> 245 

<211> 23 

<212> DNA 

<213> Artificial Sequence 



<223> D17S2019 forward primer 
<400> 245 

caaaagctta tgatgctcaa acc 



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<210> 246 

<211> 22 

<212> DNA 

<213> Artificial 



<223> D17S2019 reverse primer 

<400> 246 

ttgtttccct ttgactttct ga

<210> 247 

<211> 25 

<212> DNA 

<213> Artificial 



<223> D17S608 forward primer 

<400> 247 

taggttcacc tctcattttc ttcag 

<210> 248 

<211> 24 

<212> DNA 

<213> Artificial 



<223> D17S608 reverse primer 
<220> 

<221> misc_feature 

<222> (17)..(17)

<223> n=a, c, g or t 

<400> 248 



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<211> 20 
<212> DNA 

<213> Artificial Sequence 



<223> D17S1655 forward primer 

<400> 249 

cggaccagag tgttccatgg 

<210> 250 

<211> 20 

<212> DNA 

<213> Artificial 



<223> D17S1655 reverse primer

<400> 250 

gcatacagca ccctctacct 

<210> 251 

<211> 25 

<212> DNA 

<213> Artificial Sequence 



<223> D17S2147 forward primer 

<400> 251 

aggggagaat aaataaaatc tgtgg 

<210> 252 

<211> 22 

<212> DNA 

<213> Artificial Sequence 
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<223> D17S2147 reverse primer 

<400> 252 

caggagtgag acactctcca tg

<210> 253 

<211> 22 

<212> DNA 

<213> Artificial 



<223> D17S754 forward primer 

<400> 253 

tggattcact gactcagcct gc

<210> 254 

<211> 22 

<212> DNA 

<213> Artificial Sequence 



<223> D17S754 reverse primer 

<400> 254 

gcgtgtctgt ctccatgtgt gc 

<210> 255 

<211> 18 

<212> DNA

<213> Artificial Sequence 



<223> D17S1814 forward primer 
<400> 255 
tccccaatga cggtgatg 

<210> 256 

<211> 20 
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<212> DNA 

<213> Artificial Sequence 



<223> D17S1814 reverse primer

<400> 256 

ctggaggttg gcttgtggat 

<210> 257 

<211> 18 

<212> DNA

<213> Artificial 



<223> D17S2007 forward primer 

<400> 257 
ggtcccacga atttgctg 

<210> 258 

<211> 20 

<212> DNA 

<213> Artificial Sequence 



<223> D17S2007 reverse primer

<400> 258 

ccacccagaa aaacaggaga 

<210> 259 

<211> 20 

<212> DNA 

<213> Artificial Sequence 



<223> D17S1246 forward primer 
<400> 259 

tcgatctcct gaccttgtga 
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<210> 260 

<211> 20 

<212> DNA 

<213> Artificial 



<223> D17S1246 reverse primer 

<400> 260 

ttgtcacccc attgcctttc 

<210> 261 

<211> 21 

<212> DNA 

<213> Artificial 



<223> D17S1979 forward primer

<400> 261 

ccttggatag attcagctcc c 

<210> 262 

<211> 21 

<212> DNA 

<213> Artificial Sequence 



<220> 

<223> D17S1979 reverse primer 

<400> 262 
cttgtccctt ctcaatcctc c

<210> 263 

<211> 25 


<212> DNA 

<213> Artificial Sequence 
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<223> D17S1984 forward primer 

<400> 263 

ttaagcaagg ttttaattaa gctgc 

<210> 264 

<211> 21 

<212> DNA 

<213> Artificial



<223> D17S1984 reverse primer 

<400> 264 

gattacagtg ctacctctcc c

<210> 265 

<211> 22

<212> DNA

<213> Artificial 



<223> G11580 forward primer 

<400> 265 

ggttttaatt aagctgcatg gc 

<210> 266 

<211> 21 

<212> DNA 

<213> Artificial Sequence 



<223> G11580 reverse primer 
<400> 266 

gattacagtg ctccctctcc c 
<210> 267 
<211> 20 
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<212> DNA 

<213> Artificial Sequence 



<223> D17S1867 forward primer 

<400> 267 

agtttgacac tgaggctttg 

<210> 268 

<211> 20 

<212> DNA 

<213> Artificial Sequence 



<223> D17S1867 reverse primer 

<400> 268 

tttagacttg gtaactgccg 

<210> 269 

<211> 24 

<212> DNA 

<213> Artificial Sequence 



<223> D17S1788 forward primer 

<400> 269 

tgcagatgcc taagaacttt tcag 

<210> 270 

<211> 19 

<212> DNA 

<213> Artificial Sequence 



<223> D17S1788 reverse primer 
<400> 270 

gccatgatct cccaaagcc 
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<210> 271 

<211> 18

<212> DNA 

<213> Artificial Sequence 




<223> D17S1836 forward primer 

<400> 271 
tcgaggttat ggtgagcc

<210> 272 

<211> 24 

<212> DNA 

<213> Artificial 



<223> D17S1836 reverse primer

<400> 272 

aaactgtgtg tgtcaaagga tact 

<210> 273 

<211> 19 

<212> DNA 

<213> Artificial 



<220> 

<223> D17S1787 forward primer 

<400> 273 
gctgatctga agccaatga

<210> 274 

<211> 19 


<212> DNA 

<213> Artificial Sequence 
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<220> 

<223> D17S1787 reverse primer 

<400> 274

tacatgaagg catggtctg 

<210> 275 

<211> 23

<212> DNA 

<213> Artificial Sequence 


<220> 

<223> D17S1660 forward primer 

<400> 275

ctaatataat cctgggcaca tgg 

<210> 276 

<211> 18 


<212> DNA 

<213> Artificial Sequence 



<223> D17S1660 reverse primer 

<400> 276 
gctgcggacc agacagat 

<210> 277 

<211> 22 

<212> DNA 

<213> Artificial 



<223> D17S2154 forward primer 
<400> 277 

gataaaaaca agcactggct cc 
<210> 278 
<211> 20 
<212> DNA

<213> Artificial 



<223> D17S2154 reverse primer 

<400> 278 

cccacggctt tcttgatcta

<210> 279 

<211> 21 

<212> DNA 

<213> Artificial Sequence 



<223> D17S1955 forward primer 

<400> 279 

tgtaatgtaa gccccatgag g 

<210> 280 

<211> 25 

<212> DNA 

<213> Artificial Sequence 



<223> D17S1955 reverse primer 

<400> 280 

cactcaactc aacagtctaa aggtg 

<210> 281 

<211> 25 

<212> DNA 

<213> Artificial Sequence 



<220> 

<223> D17S2098 forward primer 
<400> 281 

gtgagttcaa gcatagtaat tatcc 
<210> 282

<211> 23 

<212> DNA 

<213> Artificial Sequence 



<223> D17S2098 reverse primer

<400> 282 

attcagcctc agttcactgc ttc

<210> 283 

<211> 20 

<212> DNA 

<213> Artificial Sequence 



<223> D17S518 forward primer 

<400> 283 

gatccagtgg agactcagag 

<210> 284 

<211> 20 

<212> DNA

<213> Artificial 



<220> 

<223> D17S518 reverse primer 

<400> 284 

tagtctctgg gacacccaga 

<210> 285 

<211> 25 

<212> DNA 

<213> Artificial Sequence 
<220>

<223> D17S518 forward primer 

<400> 285

attcctgagt gtctaccctg ttgag

<210> 286 

<211> 17

<212> DNA 

<213> Artificial Sequence 



<223> D17S518 reverse primer 

<400> 286 
actgactgcg ccactgc 

<210> 287 

<211> 20 

<212> DNA 

<213> Artificial 



<223> D11S4358 forward primer 

<400> 287 

tcgagaagga caaaatcacc 

<210> 288 

<211> 20 

<212> DNA 

<213> Artificial 



<223> D11S4358 reverse primer 
<400> 288 

gaacagggtt agtccattcg 
<210> 289 
<211> 19 
<212> DNA

<213> Artificial Sequence 



<223> D17S964 forward primer 

<400> 289 

gttctttcct cttgtgggg 

<210> 290 

<211> 19 

<212> DNA 

<213> Artificial Sequence 



<223> D17S964 reverse primer 

<400> 290 

agtcagctga gattgtgcc 

<210> 291 

<211> 20 

<212> DNA 

<213> Artificial Sequence 



<223> D19S1091 forward primer 

<400> 291 

caagccaaga catcccagtt 

<210> 292 

<211> 20 

<212> DNA 

<213> Artificial Sequence 



<223> D19S1091 reverse primer 
<400> 292 

ccccacacac agctcatatg 
<210> 293

<211> 22 

<212> DNA 

<213> Artificial Sequence



<223> D17S1179 forward primer 

<400> 293 

ttttctctct cattccattg gg 

<210> 294 

<211> 20 

<212> DNA 

<213> Artificial Sequence 



<223> D17S1179 reverse primer

<400> 294 

gcaacagagg gagactccaa 

<210> 295 

<211> 19 

<212> DNA 

<213> Artificial 



<220> 

<223> D10S2160 forward primer 

<400> 295 
tcccatcccg taagacctc

<210> 296 

<211> 25 


<212> DNA 

<213> Artificial Sequence 
<223> D10S2160 reverse primer

<400> 296 

tatggagtac ctactctatg ccagg

<210> 297 

<211> 20 

<212> DNA 

<213> Artificial 



<223> D17S1230 forward primer 

<400> 297 

attcaaagct ggatcccttt

<210> 298 

<211> 20 

<212> DNA 

<213> Artificial 



<220> 

<223> D17S1230 reverse primer 
<400> 298 

agctgtgaca aatgcctgta 



<210> 299 

<211> 20 

<212> DNA 

<213> Artificial Sequence 



<220> 



<223> D17S1338 forward primer 
<400> 299 

tcacctgaga ttgggagacc 



<210> 300 
<211> 18 

<212> DNA

<213> Artificial Sequence 



<220> 

<223> D17S1338 reverse primer

<400> 300 
aagatggggc aggaatgg 

<210> 301 

<211> 19

<212> DNA 

<213> Artificial Sequence 



<223> D17S2011 forward primer 

<400> 301 

tcactgtcct ccaagccag

<210> 302 

<211> 20 

<212> DNA 

<213> Artificial Sequence 



<223> D17S2011 reverse primer

<400> 302 

aaacaccaca ctctcccctg 

<210> 303 

<211> 20 

<212> DNA 

<213> Artificial Sequence 



<223> D17S2011 forward primer 
<400> 303 

ttcttgggct tcccgtagcc 
<210> 304

<211> 20

<212> DNA 

<213> Artificial Sequence 




<223> D17S2011 reverse primer 

<400> 304 

ggggcagacg acttctcctt 

<210> 305 

<211> 23 

<212> DNA 

<213> Artificial Sequence 



<223> D17S2038 forward primer

<400> 305 

ggggatacaa cctttaaagt tcc

<210> 306 

<211> 25 

<212> DNA 

<213> Artificial Sequence 



<223> D17S2038 reverse primer

<400> 306 

attcacctaa tgaggattct tcttt 

<210> 307 

<211> 24 

<212> DNA 

<213> Artificial Sequence 
<223> D17S2091 forward primer

<400> 307 

gctgaaatag ccatcttgag ctac 

<210> 308 

<211> 23 

<212> DNA 

<213> Artificial 



<223> D17S2091 reverse primer 

<400> 308 

tccgcatcct ttttaagagg cac 

<210> 309 

<211> 24 

<212> DNA 

<213> Artificial Sequence 



<223> D17S649 forward primer

<400> 309

ctttcactct ttcagctgaa gagg
<210> 310 
<211> 25 
<212> DNA 
<213> Artificial Sequence



<223> D17S649 reverse primer 
<400> 310 

tgacgtgcta tttcctgttt tgtct 
<210> 311 
<211> 18 
<212> DNA

<213> Artificial Sequence 



<220> 






<220> 

<223> D17S1190 reverse primer 
<400> 312 
caacacacta ccccaocra 
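
The `<211>` length declared for each record in the listing above can be cross-checked against the residues given under `<400>`. The following Python sketch is illustrative only and is not part of the application; the function name `check_record` and the restriction to the letters a, c, g, t and n are assumptions. The three records are SEQ ID NO: 267, 268 and 270 as they appear above.

```python
import re

# Assumption: primers use only the plain bases plus 'n' for an unknown residue.
VALID_BASES = set("acgtn")


def check_record(declared_length, sequence):
    """Return a list of problems found in one sequence-listing record."""
    # Listings print residues in blocks of ten, so drop all whitespace first.
    bases = re.sub(r"\s+", "", sequence.lower())
    problems = []
    invalid = sorted(set(bases) - VALID_BASES)
    if invalid:
        problems.append("invalid characters: " + "".join(invalid))
    if len(bases) != declared_length:
        problems.append(f"length {len(bases)} != declared {declared_length}")
    return problems


# (declared <211> length, <400> sequence) keyed by SEQ ID NO.
records = {
    267: (20, "agtttgacac tgaggctttg"),
    268: (20, "tttagacttg gtaactgccg"),
    270: (19, "gccatgatct cccaaagcc"),
}

for seq_id, (length, seq) in records.items():
    print(seq_id, check_record(length, seq) or "ok")
```

A record whose residues contain letters outside the assumed alphabet, or whose count disagrees with `<211>`, is reported rather than silently accepted, which helps locate transcription damage in a scanned listing.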



Claims 



1. A method for the prediction, diagnosis or prognosis of malignant neoplasia by the detection of at least 2 markers
characterized in that the markers are genes and fragments thereof or genomic nucleic acid sequences that are 
located on one chromosomal region which is altered in malignant neoplasia. 

2. A method for the prediction, diagnosis or prognosis of malignant neoplasia by the detection of at least 2 markers 
characterized in that the markers are: 

a) genes that are located on one or more chromosomal region(s) which is/are altered in malignant neoplasia; 



b) 

i) receptor and ligand; or 

ii) members of the same signal transduction pathway; or 

iii) members of synergistic signal transduction pathways; or 

iv) members of antagonistic signal transduction pathways; or 

v) transcription factor and transcription factor binding site. 

3. The method of claim 1 or 2 wherein the malignant neoplasia is breast cancer, ovarian cancer, gastric cancer, colon 
cancer, esophageal cancer, mesenchymal cancer, bladder cancer or non-small cell lung cancer. 

4. The method of claim 1 or 2 wherein at least one chromosomal region is defined as the cytogenetic region: 1p13, 
1q32, 3p21-p24, 5p13-p14, 8q23-q24, 11q13, 12q13, 17q12-q24 or 20q13.






EP 1 365 034 A2 

5. The method of claim 1 or 2 wherein at least one chromosomal region is defined as the cytogenetic region 17q11.2-21.3
and the malignant neoplasia is breast cancer, ovarian cancer, gastric cancer, colon cancer, esophageal cancer, 
mesenchymal cancer, bladder cancer or non-small cell lung cancer. 

6. The method of claim 1 or 2 wherein at least one chromosomal region is defined as the cytogenetic region 3p21-24 
and the malignant neoplasia is breast cancer, ovarian cancer, gastric cancer, colon cancer, esophageal cancer, 
mesenchymal cancer, bladder cancer or non-small cell lung cancer. 

7. The method of claim 1 or 2 wherein at least one chromosomal region is defined as the cytogenetic region 12q13 
and the malignant neoplasia is breast cancer, ovarian cancer, gastric cancer, colon cancer, esophageal cancer, 
mesenchymal cancer, bladder cancer or non-small cell lung cancer. 

8. A method for the prediction, diagnosis or prognosis of malignant neoplasia by the detection of at least one marker 
whereby the marker is a VNTR, SNP, RFLP or STS characterized in that the marker is located on one chromosomal
region which is altered in malignant neoplasia due to amplification and the marker is detected in a cancerous
and a non-cancerous tissue or biological sample of the same individual. 

9. The method of claim 8 wherein the marker is selected from the group consisting of the VNTRs: 

D17S946, D17S1181, D17S2026, D17S838, D17S250, D17S1818, D17S614, D17S2019, D17S608, 
D17S1655, D17S2147, D17S754, D17S1814, D17S2007, D17S1246, D17S1979, D17S1984, D17S1984, 
D17S1867, D17S1788, D17S1836, D17S1787, D17S1660, D17S2154, D17S1955, D17S2098, D17S518, 
D17S1851, D11S4358, D17S964, D19S1091, D17S1179, D10S2160, D17S1230, D17S1338, D17S2011, 
D17S1237, D17S2038, D17S2091, D17S649, D17S1190 and M87506. 

10. The method of claim 8 wherein the marker is selected from the group consisting of the SNPs: 

rs2230698, rs2230700, rs1058808, rs1801200, rs903506, rs2313170, rs1136201, rs2934968, rs2172826, 
rs1810132, rs1801201, rs2230702, rs2230701, rs1126503, rs3471, rs13695, rs471692, rs558068, rs1064288,
rs1061692, rs520630, rs782774, rs565121, rs2586112, rs532299, rs2732786, rs1804539, rs1804538, 
rs1804537, rs1141364, rs12231, rs1132259, rs1132257, rs113225, rs113225, rs1132254, rs113225, rs1132268
and rs11322 

11. A method for the prediction, diagnosis or prognosis of malignant neoplasia by the detection of at least one marker
characterized in that the marker is selected from: 

a) a polynucleotide or polynucleotide analog comprising at least one of the sequences of SEQ ID NO: 2 to 6, 
8, 9, 11 to 16, 18, 19, 21 to 26 or 53 to 75 ; 

b) a polynucleotide or polynucleotide analog which hybridizes under stringent conditions to a polynucleotide 
specified in (a) and encodes a polypeptide exhibiting the same biological function as specified for the respective 
sequence in Table 2 or 3 

c) a polynucleotide or polynucleotide analog the sequence of which deviates from the polynucleotide specified
in (a) and (b) due to the degeneracy of the genetic code encoding a polypeptide exhibiting the same biological
function as specified for the respective sequence in Table 2 or 3

d) a polynucleotide or polynucleotide analog which represents a specific fragment, derivative or allelic variation
of a polynucleotide sequence specified in (a) to (c)

e) a purified polypeptide encoded by a polynucleotide or polynucleotide analog sequence specified in (a) to (d)

f) a purified polypeptide comprising at least one of the sequences of SEQ ID NO: 28 to 32, 34, 35, 37 to 42, 
44, 45, 47 to 52 or 76 to 98; 

are detected. 

12. A method for the prediction, diagnosis or prognosis of malignant neoplasia by the detection of at least 2 markers 
characterized in that at least 2 markers are selected from:

a) a polynucleotide or polynucleotide analog comprising at least one of the sequences of SEQ ID NO: 1 to 26 
or 53 to 75; 

b) a polynucleotide or polynucleotide analog which hybridizes under stringent conditions to a polynucleotide 
specified in (a) and encodes a polypeptide exhibiting the same biological function as specified for the respective 
sequence in Table 2 or 3 

c) a polynucleotide or polynucleotide analog the sequence of which deviates from the polynucleotide specified 
in (a) and (b) due to the degeneracy of the genetic code encoding a polypeptide exhibiting the same biological
function as specified for the respective sequence in Table 2 or 3 

d) a polynucleotide or polynucleotide analog which represents a specific fragment, derivative or allelic variation 
of a polynucleotide sequence specified in (a) to (c) 

e) a purified polypeptide encoded by a polynucleotide sequence or polynucleotide analog specified in (a) to (d) 

f) a purified polypeptide comprising at least one of the sequences of SEQ ID NO: 27 to 52 or 76 to 98 
are detected. 

13. The method of any of the claims 1 or 12 wherein the detection method comprises the use of PCR, arrays or beads. 

14. A diagnostic kit comprising instructions for conducting the method of any of claims 1 to 13. 

15. A composition for the prediction, diagnosis or prognosis of malignant neoplasia comprising: 

a) a detection agent for: 

i) any polynucleotide or polynucleotide analog comprising at least one of the sequences of SEQ ID NO: 
2 to 6, 8, 9, 11 to 16, 18, 19, 21 to 26 or 53 to 75;

ii) any polynucleotide or polynucleotide analog which hybridizes under stringent conditions to a polynucleotide
specified in (a) encoding a polypeptide exhibiting the same biological function as specified for the
respective sequence in Table 2 or 3 

iii) a polynucleotide or polynucleotide analog the sequence of which deviates from the polynucleotide 
specified in (a) and (b) due to the degeneracy of the genetic code encoding a polypeptide exhibiting the
same biological function as specified for the respective sequence in Table 2 or 3 

iv) a polynucleotide or polynucleotide analog which represents a specific fragment, derivative or allelic 
variation of a polynucleotide sequence specified in (a) to (c) 

v) a polypeptide encoded by a polynucleotide or polynucleotide analog sequence specified in (a) to (d); 

vi) a polypeptide comprising at least one of the sequences of SEQ ID NO: 28 to 32, 34, 35, 37 to 42, 44, 
45, 47 to 52 or 76 to 98. 



b) at least 2 detection agents for at least 2 markers selected from: 

i) any polynucleotide comprising at least one of the sequences of SEQ ID NO: 1 to 26 or 53 to 75; 

ii) any polynucleotide which hybridizes under stringent conditions to a polynucleotide specified in (a) encoding
a polypeptide exhibiting the same biological function as specified for the respective sequence in
Table 2 or 3

iii) a polynucleotide the sequence of which deviates from the polynucleotide specified in (a) and (b) due
to the degeneracy of the genetic code encoding a polypeptide exhibiting the same biological function as
specified for the respective sequence in Table 2 or 3

iv) a polynucleotide which represents a specific fragment, derivative or allelic variation of a polynucleotide 
sequence specified in (a) to (c) 

v) a polypeptide encoded by a polynucleotide sequence specified in (a) to (d); 

vi) a polypeptide comprising at least one of the sequences of SEQ ID NO: 27 to 52 or 76 to 98. 

16. An array comprising a plurality of polynucleotides or polynucleotide analogs wherein each of the polynucleotides
is selected from: 

a) a polynucleotide or polynucleotide analog comprising at least one of the sequences of SEQ ID NO: 1 to 26 
or 53 to 75; 

b) a polynucleotide or polynucleotide analog which hybridizes under stringent conditions to a polynucleotide 
specified in (a) encoding a polypeptide exhibiting the same biological function as specified for the respective 
sequence in Table 2 or 3 

c) a polynucleotide or polynucleotide analog the sequence of which deviates from the polynucleotide specified 
in (a) and (b) due to the degeneracy of the genetic code encoding a polypeptide exhibiting the same biological
function as specified for the respective sequence in Table 2 or 3 

d) a polynucleotide or polynucleotide analog which represents a specific fragment, derivative or allelic variation 
of a polynucleotide sequence specified in (a) to (c) 

attached to a solid support. 

17. A method of screening for agents which regulate the activity of a polypeptide encoded by a polynucleotide or
polynucleotide analog selected from the group consisting of: 

a) a polynucleotide or polynucleotide analog comprising at least one of the sequences of SEQ ID NO: 2 to 6, 
8, 9, 11 to 16, 18, 19, 21 to 26 or 53 to 75; 

b) a polynucleotide or polynucleotide analog which hybridizes under stringent conditions to a polynucleotide 
specified in (a) encoding a polypeptide exhibiting the same biological function as specified for the respective 
sequence in Table 2 or 3 

c) a polynucleotide or polynucleotide analog the sequence of which deviates from the polynucleotide specified 
in (a) and (b) due to the degeneracy of the genetic code encoding a polypeptide exhibiting the same biological
function as specified for the respective sequence in Table 2 or 3 

d) a polynucleotide or polynucleotide analog which represents a specific fragment, derivative or allelic variation 
of a polynucleotide sequence specified in (a) to (c); 

comprising the steps of: 

i) contacting a test compound with at least one polypeptide encoded by a polynucleotide specified in (a) to 
(d); and 

ii) detecting binding of the test compound to the polypeptide, wherein a test compound which binds to the 
polypeptide is identified as a potential therapeutic agent for modulating the activity of the polypeptide in order 
to prevent or treat malignant neoplasia.

18. A method of screening for agents which regulate the activity of a polypeptide encoded by a polynucleotide or
polynucleotide analog selected from the group consisting of: 
a) a polynucleotide or polynucleotide analog comprising at least one of the sequences of SEQ ID NO: 2 to 6,
8, 9, 11 to 16, 18, 19, 21 to 26 or 53 to 75;

b) a polynucleotide or polynucleotide analog which hybridizes under stringent conditions to a polynucleotide
specified in (a) encoding a polypeptide exhibiting the same biological function as specified for the respective
sequence in Table 2 or 3

c) a polynucleotide or polynucleotide analog the sequence of which deviates from the polynucleotide specified
in (a) and (b) due to the degeneracy of the genetic code encoding a polypeptide exhibiting the same biological
function as specified for the respective sequence in Table 2 or 3

d) a polynucleotide or polynucleotide analog which represents a specific fragment, derivative or allelic variation 
of a polynucleotide sequence specified in (a) to (c) 

comprising the steps of:

i) contacting a test compound with at least one polypeptide encoded by a polynucleotide specified in (a) to 
(d); and 

ii) detecting the activity of the polypeptide as specified for the respective sequence in Table 2 or 3, wherein a
test compound which increases the activity is identified as a potential preventive or therapeutic agent for
increasing the polypeptide activity in malignant neoplasia, and wherein a test compound which decreases the
activity of the polypeptide is identified as a potential therapeutic agent for decreasing the polypeptide activity
in malignant neoplasia.


19. A method of screening for agents which regulate the activity of a polynucleotide or polynucleotide analog selected
from the group consisting of:

a) a polynucleotide or polynucleotide analog comprising at least one of the sequences of SEQ ID NO: 2 to 6, 
8, 9, 11 to 16, 18, 19, 21 to 26 or 53 to 75;

b) a polynucleotide or polynucleotide analog which hybridizes under stringent conditions to a polynucleotide 
specified in (a) encoding a polypeptide exhibiting the same biological function as specified for the respective 
sequence in Table 2 or 3 


c) a polynucleotide or polynucleotide analog the sequence of which deviates from the polynucleotide specified 
in (a) and (b) due to the degeneracy of the genetic code encoding a polypeptide exhibiting the same biological
function as specified for the respective sequence in Table 2 or 3 

d) a polynucleotide or polynucleotide analog which represents a specific fragment, derivative or allelic variation
of a polynucleotide sequence specified in (a) to (c)

comprising the steps of: 

i) contacting a test compound with at least one polynucleotide or polynucleotide analog specified in (a) to (d),
and

ii) detecting binding of the test compound to the polynucleotide, wherein a test compound which binds to the 
polynucleotide is identified as a potential preventive or therapeutic agent for regulating the activity of the
polynucleotide in malignant neoplasia.

20. Use of 

a) a polynucleotide or polynucleotide analog comprising at least one of the sequences of SEQ ID NO: 2 to 6, 
8, 9, 11 to 16, 18, 19, 21 to 26 or 53 to 75;

b) a polynucleotide which hybridizes under stringent conditions to a polynucleotide or polynucleotide analog 
specified in (a) encoding a polypeptide exhibiting the same biological function as specified for the respective 
sequence in Table 2 or 3;

c) a polynucleotide or polynucleotide analog the sequence of which deviates from the polynucleotide specified 
in (a) and (b) due to the degeneracy of the genetic code encoding a polypeptide exhibiting the same biological
function as specified for the respective sequence in Table 2 or 3; 

d) a polynucleotide or polynucleotide analog which represents a specific fragment, derivative or allelic variation 
of a polynucleotide sequence specified in (a) to (c); 

e) an antisense molecule targeting specifically one of the polynucleotide sequences specified in (a) to (d); 

f) a purified polypeptide encoded by a polynucleotide or polynucleotide analog sequence specified in (a) to (d) 

g) a purified polypeptide comprising at least one of the sequences of SEQ ID NO: 28 to 32, 34, 35, 37 to 42,
44, 45, 47 to 52 or 76 to 98;

h) an antibody capable of binding to one of the polynucleotides specified in (a) to (d) or a polypeptide specified
in (f) and (g); 

i) a reagent identified by any of the methods of claim 17 to 19 that modulates the amount or activity of a 
polynucleotide sequence specified in (a) to (d) or a polypeptide specified in (f) and (g); 

in the preparation of a composition for the prevention, prediction, diagnosis, prognosis or a medicament for the 
treatment of malignant neoplasia. 

21. Use of claim 20 wherein the disease is breast cancer.

22. A reagent that regulates the activity of a polypeptide selected from the group consisting of: 

a) a polypeptide encoded by any polynucleotide or polynucleotide analog comprising at least one of the
sequences of SEQ ID NO: 2 to 6, 8, 9, 11 to 16, 18, 19, 21 to 26 or 53 to 75;

b) a polypeptide encoded by any polynucleotide or polynucleotide analog which hybridizes under stringent 
conditions to any polynucleotide comprising at least one of the sequences of SEQ ID NO: 2 to 6, 8, 9, 11 to 
16, 18, 19, 21 to 26 or 53 to 75 encoding a polypeptide exhibiting the same biological function as specified for 
the respective sequence in Table 2 or 3 

c) a polypeptide encoded by any polynucleotide or polynucleotide analog the sequence of which deviates from 
the polynucleotide specified in (a) and (b) due to the degeneracy of the genetic code encoding a polypeptide
exhibiting the same biological function as specified for the respective sequence in Table 2 or 3 

d) a polypeptide encoded by any polynucleotide or polynucleotide analog which represents a specific fragment, 
derivative or allelic variation of a polynucleotide sequence specified in (a) to (c) encoding a polypeptide
exhibiting the same biological function as specified for the respective sequence in Table 2 or 3

e) or a polypeptide comprising at least one of the sequences of SEQ ID NO: 28 to 32, 34, 35, 37 to 42, 44,
45, 47 to 52 or 76 to 98;

wherein said reagent is identified by the method of any of the claims 17 to 19. 

23. A reagent that regulates the activity of a polynucleotide or polynucleotide analog selected from the group consisting
of:



a) a polynucleotide or polynucleotide analog comprising at least one of the sequences SEQ ID NO: 2 to 6, 8, 
9, 11 to 16, 18, 19, 21 to 26 or 53 to 75;

b) a polynucleotide or polynucleotide analog which hybridizes under stringent conditions to a polynucleotide 
specified in (a) encoding a polypeptide exhibiting the same biological function as specified for the respective 
sequence in Table 2 or 3

c) a polynucleotide or polynucleotide analog the sequence of which deviates from the polynucleotide specified 
in (a) and (b) due to the degeneracy of the genetic code encoding a polypeptide exhibiting the same biological
function as specified for the respective sequence in Table 2 or 3

d) a polynucleotide or polynucleotide analog which represents a specific fragment, derivative or allelic variation 
of a polynucleotide sequence specified in (a) to (c) encoding a polypeptide exhibiting the same biological
function as specified for the respective sequence in Table 2 or 3 


wherein said reagent is identified by the method of any of the claims 17 to 19. 

24. A pharmaceutical composition, comprising: 

a) an expression vector containing at least one polynucleotide or polynucleotide analog selected from the
group consisting of:

i) a polynucleotide or polynucleotide analog comprising at least one of the sequences of SEQ ID NO: 2 
to 6, 8, 9, 11 to 16, 18, 19, 21 to 26 or 53 to 75;


ii) a polynucleotide or polynucleotide analog which hybridizes under stringent conditions to a polynucleotide
specified in (a) encoding a polypeptide exhibiting the same biological function as specified for the
respective sequence in Table 2 or 3 

iii) a polynucleotide or polynucleotide analog the sequence of which deviates from the polynucleotide
specified in (a) and (b) due to the degeneracy of the genetic code encoding a polypeptide exhibiting the
same biological function as specified for the respective sequence in Table 2 or 3

iv) a polynucleotide or polynucleotide analog which represents a specific fragment, derivative or allelic
variation of a polynucleotide sequence specified in (a) to (c) encoding a polypeptide exhibiting the same
biological function as specified for the respective sequence in Table 2 or 3;

or the reagent of claim 22 or 23 and a pharmaceutically acceptable carrier.

25. A computer-readable medium comprising:

a) at least one digitally encoded value representing a level of expression of at least one polynucleotide
sequence of SEQ ID NO: 2 to 6, 8, 9, 11 to 16, 18, 19, 21 to 26 or 53 to 75

b) at least 2 digitally encoded values representing the levels of expression of at least 2 polynucleotide
sequences selected from SEQ ID NO: 1 to 26 or 53 to 75

in a cell from a subject at risk for or having malignant neoplasia.

26. A method for the detection of chromosomal alterations characterized in that the relative abundance of individual
mRNAs, encoded by genes, located in altered chromosomal regions is detected. 

27. A method for the detection of chromosomal alterations characterized in that the copy number of one or more 
chromosomal region(s) is detected by quantitative PCR. 
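
Claim 27 does not specify how the copy number is derived from the quantitative PCR read-out. One common approach is the comparative Ct method, sketched here in Python under three assumptions that are not taken from the claims: roughly 100% amplification efficiency, a reference locus whose copy number is unchanged, and two copies of the target region in the non-cancerous sample. The function name `relative_copy_number` and all Ct values are hypothetical.

```python
def relative_copy_number(ct_target_tumor, ct_ref_tumor,
                         ct_target_normal, ct_ref_normal,
                         normal_copies=2.0):
    """Estimate the target copy number in tumor tissue by the comparative Ct method."""
    # Normalize each sample's target Ct to its own reference locus.
    delta_tumor = ct_target_tumor - ct_ref_tumor
    delta_normal = ct_target_normal - ct_ref_normal
    # A Ct shift of one cycle corresponds to a two-fold difference in template
    # (assumption: ideal doubling per cycle).
    fold_change = 2.0 ** -(delta_tumor - delta_normal)
    return normal_copies * fold_change


# Hypothetical Ct values: the target crosses threshold three cycles earlier
# in the tumor sample, relative to the reference locus.
copies = relative_copy_number(22.0, 25.0, 25.0, 25.0)
print(copies)
```

Comparing cancerous and non-cancerous tissue of the same individual, as in claim 8, keeps the normalization internal to the patient; the estimate is only as reliable as the assumed amplification efficiency.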
